
UTF-decoding issue with java

 
Arun Chaitanya
Greenhorn
Posts: 1
I fetch data from a DB column that uses UTF-8 encoding. When I read it as a String, the value comes back in the form \u00f1, whereas I expect it to be decoded. Can you help me work out how to read this column so that I get the decoded string instead of Unicode escape sequences?
 
Mark Vedder
Ranch Hand
Posts: 624
Welcome to JavaRanch.

When you are creating your database connection, are you explicitly setting it to use UTF-8 for encoding? If not, it is likely not defaulting to UTF-8.

Typically you will need to set this on your DataSource. It will vary from database implementation to database implementation. With MySQL, for example, there is a setEncoding(String) method. Many databases (or DataSource implementations) allow you to set the encoding as part of the URL string used when creating the DataSource. For example, in MySQL, you can do this:
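
A MySQL Connector/J URL of this kind might look like the sketch below. The host, port, database name (mydb) and the class and method names are placeholders, not taken from the original post; useUnicode and characterEncoding are Connector/J connection properties.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class Utf8ConnectionExample {
    // Hypothetical host, port, and database name; adjust for your setup.
    // characterEncoding=UTF-8 asks the driver to exchange text as UTF-8.
    static final String URL =
        "jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=UTF-8";

    // Opens a connection using the URL above (requires the MySQL driver
    // on the classpath and a reachable server).
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(URL, user, password);
    }
}
```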



The syntax used will vary. Check your Database documentation for more information.

[edit]
After rereading your post, I see you stated "from DB which has UTF-8 encoding for a column". You mentioned a column rather than the DB. The above applies only if the database itself is encoded using UTF-8, and it won't solve the issue if you are storing the actual string "\u00f1" in the column, for the reasons Ulf mentions below.
[ December 30, 2008: Message edited by: Mark Vedder ]
 
Ulf Dittmer
Rancher
Posts: 42966
Welcome to JavaRanch.

Do you mean that the data stored in the DB literally includes something like "\u00f1"? That is a Java-specific notation for Unicode characters, and it really should not be stored like this in a DB. I'd advise storing proper Unicode characters instead (and setting the DB encoding accordingly).
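
If the column really does contain the literal six-character text "\u00f1" and the stored data cannot be fixed right away, one possible workaround is to decode the escapes after reading. This is only a sketch and not something proposed in the thread: the class name UnicodeEscapes and the method decode are hypothetical, and it handles only the \uXXXX form with exactly four hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeEscapes {
    // Matches a literal backslash, 'u', and exactly four hex digits.
    private static final Pattern ESCAPE =
        Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each \uXXXX escape in the input with the character it names.
    // Text that is not an escape is copied through unchanged.
    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or '\' in the result.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, calling UnicodeEscapes.decode on the value read from the ResultSet would turn the stored text Espa\u00f1a into España. Storing real Unicode characters, as advised above, remains the cleaner fix.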
 
Campbell Ritchie
Sheriff
Posts: 47228
Sounds like a database-specific problem; it would sit better in our JDBC forum. Moving.
 