
How can I transform a Unicode String into a Japanese String?

 
Javan Li
Ranch Hand
Posts: 84
I have a Unicode String that contains a Japanese word, and I want to compare it with a real Japanese word. How can I transform it into a Japanese word? Thanks very much!
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
A "real Japanese" word? What does that mean? All characters are stored as Unicode in Java.
 
Javan Li
Ranch Hand
Posts: 84
I mean I use Oracle, and its charset is "UTF-8". I have to insert a Japanese String that comes from an XML file declared with encoding="Shift_JIS". After the insert, I read this String back from Oracle and compare it with the one in the XML file. They are not equal. I wonder what I can do to make them equal, that is, what I should do with the String from the database. Thanks very much!
[ June 18, 2003: Message edited by: Javan Li ]
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
If the two objects have been correctly converted into Strings (which by definition use Unicode internally), then equals() will return the correct answer here. That means either (a) one of the two Strings was converted incorrectly, (b) the two Strings really are not equal, or (c) someone lied to you about the character sets which are being used. Based on my own experience, it's usually (c). But don't overlook (b). In this case it's difficult for me to imagine how you could screw up the conversion yourself: when you read from the XML file, most parsers should take care of the character conversion for you (after reading the encoding declaration from the file itself). And if the database is set up correctly, using ResultSet's getString() method should likewise handle the conversion.
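To make case (a) concrete, here is a minimal sketch of how one wrong decoding step leaves two Strings unequal. The class name CharsetMismatchDemo, the sample word, and the choice of ISO-8859-1 as the "wrong" charset are only illustrative assumptions, not anything taken from the original poster's code. It also assumes the JRE ships the Shift_JIS charset, which full JDK builds normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that the same bytes decoded with two
// different charsets give two different Strings.
class CharsetMismatchDemo {

    // Decodes the raw bytes using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumption: the JRE supports Shift_JIS.
        Charset shiftJis = Charset.forName("Shift_JIS");

        String original = "\u65E5\u672C\u8A9E"; // sample Japanese word
        byte[] raw = original.getBytes(shiftJis); // bytes as stored in a Shift_JIS file

        String right = decode(raw, shiftJis);                    // correct charset
        String wrong = decode(raw, StandardCharsets.ISO_8859_1); // wrong charset

        System.out.println(original.equals(right)); // true
        System.out.println(original.equals(wrong)); // false
    }
}
```

Once the bytes have been decoded with the wrong charset, equals() is doing its job by returning false, so the fix has to happen where the bytes become characters.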
Here's what I'd do: for each of the two "different" Strings you're dealing with, write the value to a small file with a .html extension. Use UTF-8 to create the OutputStreamWriter. Then look at the files with your browser - go to View -> Character Encoding to make sure it's using UTF-8. Do the files look the same, or not? Show the files to someone who understands the language and data, if you don't - does either of the files display the "correct" value? This can help guide you to where the problem is - is it the XML data, or the XML parsing, or the Oracle data, or the Oracle configuration (wherever the charset is declared)? There are many places something could be wrong here, but the place to fix it is almost certainly before you've converted the data into a Java String.
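The check described above could look roughly like this. It is only a sketch: the class name HtmlDump, the file names, and the placeholder values are assumptions, and in real use the two Strings would come from the XML parser and from ResultSet's getString(). The value is written as-is, so a String containing < or & would need escaping first.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes a String into a tiny UTF-8 HTML page
// so it can be inspected in a browser.
class HtmlDump {

    static void dump(String value, Path file) throws IOException {
        // The OutputStreamWriter is created with UTF-8 explicitly,
        // never with the platform default encoding.
        try (Writer out = new OutputStreamWriter(
                Files.newOutputStream(file), StandardCharsets.UTF_8)) {
            out.write("<html><head><meta charset=\"UTF-8\"></head><body><p>");
            out.write(value); // not HTML-escaped in this sketch
            out.write("</p></body></html>");
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholders standing in for the two Strings being compared.
        String fromXml = "\u65E5\u672C\u8A9E";
        String fromDatabase = "\u65E5\u672C\u8A9E";

        dump(fromXml, Paths.get("from-xml.html"));
        dump(fromDatabase, Paths.get("from-db.html"));
    }
}
```

Open both files in the browser side by side; if one of them shows garbage, that side of the pipeline is where the conversion went wrong.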
 
Javan Li
Ranch Hand
Posts: 84
Thanks , Jim.
And I found the problem. The field that stores the long String has the long type, and when I inserted this String into Oracle I didn't use the right InputStreamReader. So even though the two Strings look the same, they are not equal. I have fixed this bug. Thanks!
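For anyone who finds this thread later, the kind of fix described here might be sketched as follows: give the InputStreamReader an explicit charset instead of relying on the platform default. The class name ExplicitCharsetReader and the method readAll are hypothetical, and the thread does not show the actual insert code, so how the resulting text is then handed to Oracle is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper: reads a whole byte stream into a String using
// an explicitly named charset.
class ExplicitCharsetReader {

    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        // Passing the charset here is the important part; the
        // one-argument InputStreamReader constructor would fall back
        // to the platform default encoding.
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```

A call such as readAll(stream, Charset.forName("Shift_JIS")) would be the usage for data that really is Shift_JIS; the charset passed in has to match what the bytes actually are, not what they are assumed to be.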
 