Unicode file reading issue

Subin Martin

Joined: Aug 08, 2010
Posts: 5
Hi All,

I am facing an issue which I couldn't solve even after repeated searching. Hope you all can help me.
I have a txt file (ASCII) containing the decimal representation of a Unicode string (e.g. កម្មវិធី).
I have a function which converts these decimal values to hexadecimal Unicode escapes, i.e. in this case (\u1780\u1798\u17d2\u1798\u179c\u17b7\u1792\u17b8).

The logic is that I take each line from the file, convert it into unicode hex representation and then write to a JPG file.
The issue here is with the hexValue variable. It's getting the correct value, but when I use it in the function call, i.e. ConvertUnicode(image, hexValue);, it's not working. If I assign the corresponding value to the variable directly in the code (i.e. \u1780..), it works fine. I checked for extra carriage returns, spaces, etc., but the issue is still there.

I tried printing decoded = "-"+decoded+"-"; in the function and got -\u1780\u1798\u17d2\u1798\u179c\u17b7\u1792\u17b8-, which shows no extra characters.

Am I making a mistake in the file encoding part? I tried UTF encoding and read the file accordingly, but the same issue is still there. I would be very thankful for any pointers regarding this.

Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

I didn't read much of your code (in future put it into the "Code" tags so it's readable). But I think you are taking the way a character can be represented as a Unicode escape in a Java string literal, and assuming it makes sense in other contexts. It doesn't. The only reason you would want to represent a character as 6 characters starting with backslash-U is when it's going to be in a string literal in Java source code.

Or perhaps I misunderstood your question. You lost me at "the decimal representation of a Unicode"... what was that supposed to mean?
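
To make the point about string literals concrete, here is a minimal sketch. The class and method names (EscapeDemo, fromFile, fromLiteral) are made up for illustration and are not from the original code. It contrasts the six characters that arrive when the escape text is read at run time with the single character the compiler produces from the same escape in source code.

```java
public class EscapeDemo {
    // Hypothetical stand-in for text read from a file at run time:
    // a real backslash, the letter u, and four hex digits (six chars).
    static String fromFile() {
        return "\\u1780";
    }

    // The same escape written in Java source: the compiler replaces it
    // with the single Khmer character before the program ever runs.
    static String fromLiteral() {
        return "\u1780";
    }

    public static void main(String[] args) {
        System.out.println(fromFile().length());              // 6
        System.out.println(fromLiteral().length());           // 1
        System.out.println(fromFile().equals(fromLiteral())); // false
    }
}
```

Printing both values can look identical if the first one is printed as-is, which would explain why no extra characters showed up in the debug output.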
Subin Martin

Joined: Aug 08, 2010
Posts: 5
Hi Paul,

Thanks a lot for your help. I got the issue. Now I am able to get it working just by using the hex values and not the character escaped values. Thanks again.
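
The final code was not posted, so the following is only a sketch of what "using the hex values" could look like. It assumes the hex code points are available as whitespace-separated tokens; the class HexToText and the method decode are hypothetical names, not part of the original program.

```java
public class HexToText {
    // Hypothetical helper: turns "1780 1798 ..." into the actual characters
    // by parsing each token as a hexadecimal code point.
    static String decode(String hexCodePoints) {
        String trimmed = hexCodePoints.trim();
        StringBuilder sb = new StringBuilder();
        if (trimmed.isEmpty()) {
            return sb.toString();
        }
        for (String token : trimmed.split("\\s+")) {
            sb.appendCodePoint(Integer.parseInt(token, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String word = decode("1780 1798 17d2 1798 179c 17b7 1792 17b8");
        System.out.println(word.length()); // 8
    }
}
```

If the file holds decimal values instead, the same loop should work with Integer.parseInt(token) and no radix argument, which would skip the hexadecimal step entirely.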
