
Converting String to Unicode

Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi
I have to convert a string to Unicode. I am doing it as below:

char[] characters = strToUnicode.toCharArray();
for (int i = 0; i < characters.length; i++) {
    char c = characters[i];
    sbStrToUnicode.append((int) c);
    sbStrToUnicode.append(" ");
}

where strToUnicode is the string to convert (say, Java Ranch) and
sbStrToUnicode is the StringBuffer to which I add the data.

This gives me the correct result.
I want to know if this is the most optimised way to do it.
It may sound a bit stupid, but I want to do it the best possible way as I have to convert a lot of data to Unicode.
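
For reference, a runnable sketch of the loop above. The class and method names (CodeUnitDump, toCodeUnits, fromCodeUnits) are hypothetical, and two details differ from the post: it uses StringBuilder, the unsynchronized counterpart of StringBuffer, and it omits the trailing space. The decoder is only an assumption about what the client does to get the text back.

```java
class CodeUnitDump {
    // Writes each UTF-16 code unit of the input as a decimal number, separated by spaces.
    static String toCodeUnits(String text) {
        StringBuilder sb = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) text.charAt(i));
        }
        return sb.toString();
    }

    // Hypothetical inverse: rebuilds the string from the space-separated numbers.
    static String fromCodeUnits(String dump) {
        if (dump.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (String part : dump.split(" ")) {
            sb.append((char) Integer.parseInt(part));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dump = toCodeUnits("Java Ranch");
        System.out.println(dump);                 // 74 97 118 97 32 82 97 110 99 104
        System.out.println(fromCodeUnits(dump));  // Java Ranch
    }
}
```

The output is plain ASCII digits, which would explain why it survives a transport that mangles other characters, at the cost of several bytes per character.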


Jhakda


If I become filthy rich, I'll sponsor research for painless dental treatment at Harvard Medical School. That's why I'm learning Java. I have 32 teeth, 22 are man-made.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Um, Java Strings are already in Unicode. Or they should be. I'm guessing that perhaps your data got corrupted when it was read, or when it's written, and what you're doing now is repairing damage that was done elsewhere. If that's the case, it would probably be more effective to figure out how the damage occurred in the first place. Where are these strings getting read from, and where are they being written to? A file? A database? What java.io classes are you using to read the data? Probably you need to find classes or methods that allow you to specify a character encoding.
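
As a small illustration of that last point, here is a sketch that decodes the same bytes twice using InputStreamReader, once with the charset they were written in and once with a different one. The class name DecodeDemo and the sample text are made up for the example; it is not meant to reproduce the original poster's setup.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Reads all characters from the raw bytes, decoding them with the given charset.
    static String read(byte[] raw, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(raw), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E"; // three Japanese characters
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in restores the text.
        System.out.println(read(raw, StandardCharsets.UTF_8).equals(original));      // true

        // Decoding with a different charset yields junk characters instead.
        System.out.println(read(raw, StandardCharsets.ISO_8859_1).equals(original)); // false
    }
}
```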

Also, your method probably won't work correctly for non-European languages (and not even all European languages). If you have any Asian data, for example, you should try processing that to see if the results are acceptable.
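
One concrete case where a char-by-char loop gives a different answer, which may or may not be what is meant above: characters outside the Basic Multilingual Plane occupy two char values (a surrogate pair) in a Java String. A sketch that walks code points instead, with a hypothetical class name and sample characters chosen for the example:

```java
class CodePointDump {
    // Writes each Unicode code point of the input as a decimal number, separated by spaces.
    static String toCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(codePoint);
            // Advance by one char for BMP characters, two for surrogate pairs.
            i += Character.charCount(codePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String common = "\u65E5";      // U+65E5, fits in a single char
        String rare = "\uD842\uDFB7";  // U+20BB7, stored as a surrogate pair

        System.out.println(common.length() + " -> " + toCodePoints(common)); // 1 -> 26085
        System.out.println(rare.length() + " -> " + toCodePoints(rare));     // 2 -> 134071
    }
}
```

For text made only of single-char characters, this produces the same numbers as the char-based loop.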


"I'm not back." - Bill Harding, Twister
Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi
Thank you for your reply. I think I did not explain my problem correctly.

Actually, I read data from the DB, which contains Japanese and Chinese characters. I call resultset.getString(ColName) to get this data. Now I have to convert this to Unicode before I send it to the client side for processing. If I send it without converting, it shows up as junk data.
So is the method I adopted correct?

Thanks
Jhakda
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41116
As Jim said, Java strings are in Unicode. The question is: how are they getting to the client? Are you specifying a character encoding like UTF-8 as the transfer encoding?


Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi
I connect to the app server through XMLHttp.
Yes, while sending data from the client, since I send it as XML, I specify UTF-8.
But while sending data back, I pass it in the response text.
If I don't convert, it shows up as junk.
If I convert, and then convert back to normal form on the client, it works fine.

Thanks

Jhakda
Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi All
I'll give some more details about my problem.
Well, fetching the data is all done at the server level, in a DAO written in Java. But my client is Excel, and it can be Word too.
So if I send back data without converting to Unicode, I get strange characters. I got help from a Japanese guy who suggested using Unicode. No, there is no issue in network transmission. Is there a problem if we send Japanese and Chinese characters to Microsoft products?

Jhakda
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41116
How do you know that there is no problem during transmission - have you made sure that all sides use the same transfer encoding?

How do Word and Excel use XmlHttp - did you not mean the JavaScript XmlHttpRequest object by that? If not, what is XmlHttp?
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541

You have a database? And you're extracting data from it and displaying it in a web application, and maybe getting data back from the user and putting that into the database? Then you must read Character Conversions from Browser to Database.

And do remember that Java strings are already in Unicode. I suspect you have the usual naive definition of Unicode (something like "characters I don't use in my language") so it might be a good idea to read The Absolute Minimum Every Software Developer Must Know About Unicode as well.
Raghavan Muthu
Ranch Hand

Joined: Apr 20, 2006
Posts: 3344

Originally posted by Jhakda Velu:
..Yes, while sending data from the client, since I send it as XML, I specify UTF-8.
But while sending data back, I pass it in the response text.
If I don't convert, it shows up as junk.


If I am right, by default the response will have its 'contentType' set to 'text/html'. In that case, it will show up as junk characters.


If I convert, and then convert back to normal form on the client, it works fine.


What does this line mean? How do you process the data once you get it?


Everything has got its own deadline including one's EGO!
Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi All
Thanks all for your responses.
Well, I don't use a browser. I have Excel as the front end and I need to fetch data from the DB. I connect to the app server through an XMLHttp call. This hits the servlet running on the app server, and from there the flow is like a normal web app (service --> DAO --> DB and back). And a correction: I write the data I get into the ServletOutputStream as a String.

This data is received at the client (Excel) and displayed after some processing using VB macros.

The reason I say there is no error in transmission (though I can't say that I'm 100% sure) is that I run the tests on an app server running on my machine, and I have tested around 50 times each with and without converting each literal I get from the DB. Each time I do the conversion, it shows up fine, and it doesn't work otherwise.
I will go through the links mentioned to me anyway, but I would like to ask what would happen if I do a resultset.getString(1) where column number 1 contains Chinese/Japanese data in the DB.
Please don't ask me not to use Excel as the front end, as the users are hell-bent on it and the person who gathered the requirements couldn't convince them otherwise.


Jhakda
[ April 09, 2008: Message edited by: Jhakda Velu ]
Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi All
To add more information: initially I thought that, as I'm running an English version of the OS, Japanese data was displayed incorrectly because of it (it appears as an upside-down ?). I then got a system running the Japanese version of the OS, but the problem remained.

Thanks
Jhakda
Raghavan Muthu
Ranch Hand

Joined: Apr 20, 2006
Posts: 3344

Originally posted by Jhakda Velu:
Well, I don't use a browser. I have Excel as the front end and I need to fetch data from the DB.


This all depends on the output device (the target). Instead of displaying the data in a browser, you are giving it to Excel, if I am right.


..And a correction: I write the data I get into the ServletOutputStream as a String.



This is what I was saying. In a Servlet, the default output setting will be "text/html". You need to update it with



This is the same as what Ulf Dittmer suggested.
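
The code that originally followed "You need to update it with" has not survived in this copy of the thread, so what follows is only a sketch of the idea, not Raghavan's snippet. In a real servlet the relevant call would be along the lines of response.setContentType("text/plain; charset=UTF-8") made before response.getWriter(); the "text/plain" MIME type there is an assumption. To stay runnable without a servlet container, the sketch below uses plain java.io stand-ins (the names ResponseEncodingDemo, writeBody and readBody are hypothetical) to show what the choice of charset does to the bytes. ISO-8859-1 is used as the "no charset specified" case because that is the documented default for a servlet response writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResponseEncodingDemo {
    // Stand-in for the server side: encodes the response text with the given charset.
    static byte[] writeBody(String text, Charset charset) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(body, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toByteArray();
    }

    // Stand-in for the client side: decodes the bytes with the charset it was told about.
    static String readBody(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        String japanese = "\u65E5\u672C\u8A9E";

        // ISO-8859-1 cannot represent these characters, so each one is replaced by '?'.
        byte[] latin1 = writeBody(japanese, StandardCharsets.ISO_8859_1);
        System.out.println(readBody(latin1, StandardCharsets.ISO_8859_1)); // ???

        // With UTF-8 declared and used on both sides, the text survives the round trip.
        byte[] utf8 = writeBody(japanese, StandardCharsets.UTF_8);
        System.out.println(readBody(utf8, StandardCharsets.UTF_8).equals(japanese)); // true
    }
}
```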

Perhaps these two links may help you.

  • Servlet API on - setContentType
  • Why does my Servlet request lose UTF-8 encoding in passed-in attributes?


Hope this helps!

[ April 10, 2008: Message edited by: Raghavan Muthu -- edited the URL String ]
[ April 10, 2008: Message edited by: Raghavan Muthu ]
Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi All
Thanks a lot for your suggestions.
I will set the content type and test. Will keep you all updated.

Thanks Again

Jhakda
Jhakda Velu
Ranch Hand

Joined: Feb 26, 2008
Posts: 166
Hi All
Eureka !!!
It worked.
Thanks a lot for the guidance. Special thanks to Ulf and Raghu.

Each day I come to know how little I know.

Jhakda