UTF-8 encoding

hanihanan younis
Ranch Hand

Joined: Jul 06, 2005
Posts: 36
Hi all,
I convert Arabic words plus English numbers to binary UTF-8:
byte utf8[] = message.getBytes("UTF-8");


where message, as I said, is a string of Arabic words and English numbers.
When a J2ME application reads it, it throws UTF8FormatException.
I read that this may be because the string shouldn't mix different charsets.
I would like a solution for this, please.
Thanks to anyone who reads this and is willing to help.
Jim Yingst

Joined: Jan 30, 2000
Posts: 18671
Did you mean a UTFDataFormatException? The documentation for this exception actually tells you the problem, though it's subtle. Did you write the data using a DataOutputStream, using the writeUTF() method? The thing to realize is that this method uses modified UTF-8, and more importantly, the first two bytes written aren't UTF-8 at all: they're an unsigned 16-bit number giving the length, in bytes, of the encoded string that follows. If you want to read data that was written using writeUTF(), the best way is to use the corresponding method in DataInputStream, readUTF(). This method understands how to use the two bytes of length info and the details of modified UTF-8. Other classes and methods which use "real" UTF-8 (unmodified, with no length info) will not understand the output of writeUTF(), and likewise readUTF() will not understand plain UTF-8 bytes such as those produced by getBytes("UTF-8"). It's unfortunate that Sun chose to name these methods with "UTF" in the name, as it misleads people into thinking they're using real UTF-8 rather than a custom variation.
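
To make the difference concrete, here is a minimal sketch comparing the two encodings on a mixed Arabic/digit string. It is only an illustration under assumptions: the class name Utf8Demo, the helper method names, and the sample message are hypothetical and not taken from the original poster's code, and it targets Java SE (StandardCharsets and UncheckedIOException may not be available on a J2ME device).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class Utf8Demo {

    // "Real" UTF-8: just the encoded bytes, no length prefix.
    static byte[] plainUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // writeUTF(): a 2-byte big-endian length, then modified UTF-8.
    static byte[] writeUtfBytes(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(s);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // readUTF(): the matching reader for bytes produced by writeUTF().
    static String readUtfBytes(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Five Arabic letters, a space, and three ASCII digits (assumed sample text).
        String message = "\u0645\u0631\u062D\u0628\u0627 123";

        byte[] plain = plainUtf8(message);
        byte[] framed = writeUtfBytes(message);

        System.out.println("plain UTF-8 length: " + plain.length);
        System.out.println("writeUTF length:    " + framed.length);
        System.out.println("length prefix:      " + (((framed[0] & 0xFF) << 8) | (framed[1] & 0xFF)));
        System.out.println("round trip ok:      " + message.equals(readUtfBytes(framed)));
    }
}
```

If the reader on the device calls readUTF(), the writer should use writeUTF(); if the reader decodes plain UTF-8, the writer should send the getBytes("UTF-8") output along with its own length information. Mixing Arabic and digits in one string is not a problem in either case.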

"I'm not back." - Bill Harding, Twister