
UTF-8 encoding

 
hanihanan younis
Ranch Hand
Posts: 36
Hi all,
I convert a string of Arabic words and English (ASCII) digits to UTF-8 bytes:
byte[] utf8 = message.getBytes("UTF-8");

response.setContentLength(utf8.length);

out.write(utf8);
out.flush();
out.close();
Here message is, as I said, a string of Arabic words and English digits.
When a J2ME application reads the response, it throws UTF8FormatException.
I read that this may be because the string should not mix different character sets.
Does anyone have a solution for this, please?
Thanks to anyone who reads this and tries to help.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Did you mean a UTFDataFormatException? The documentation for this exception actually tells you the problem, though it's subtle. Did you write the data using a DataOutputStream, using the writeUTF() method? The thing to realize is that this method uses modified UTF-8, and more importantly, the first two bytes written aren't UTF-8 at all: they're an unsigned 16-bit number giving the number of encoded bytes that follow.

If you want to read data that was written using writeUTF(), the best way is to use the corresponding method in DataInputStream, readUTF(). This method understands how to use the two bytes of length info and the details of modified UTF-8. Other classes and methods which use "real" UTF-8 (unmodified, with no length info) will not understand the output of writeUTF(). The reverse also holds: readUTF() will not understand plain UTF-8 bytes such as those produced by getBytes("UTF-8"), because it treats the first two bytes as a length. It's unfortunate that Sun chose to put "UTF" in the names of these methods, as it misleads people into thinking they're using real UTF-8 rather than a custom variation.
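To make the difference concrete, here is a minimal Java SE sketch that encodes the same string both ways and decodes each with its matching method. The class name Utf8VsWriteUtf, its helper methods, and the sample message are made up for illustration; they are not from the original posts. It also uses StandardCharsets, which exists in Java SE 7 and later but not in J2ME. On a CLDC device the plain-UTF-8 side would more likely be new String(bytes, "UTF-8"), assuming the device supports that encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: contrasts plain UTF-8 with the writeUTF() format.
public class Utf8VsWriteUtf {

    // Plain UTF-8: only the encoded characters, no length prefix.
    static byte[] encodePlain(String message) {
        return message.getBytes(StandardCharsets.UTF_8);
    }

    // writeUTF() format: 2-byte big-endian byte count, then modified UTF-8.
    static byte[] encodeWithWriteUTF(String message) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeUTF(message);
        out.flush();
        return buffer.toByteArray();
    }

    // Matching reader for plain UTF-8 bytes.
    static String decodePlain(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Matching reader for bytes produced by writeUTF().
    static String decodeWithReadUTF(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        return in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        // Sample message (an assumption): five Arabic letters, a space, three digits.
        String message = "\u0645\u0631\u062D\u0628\u0627 123";

        byte[] plain = encodePlain(message);
        byte[] framed = encodeWithWriteUTF(message);

        // The writeUTF() output is two bytes longer because of the length prefix.
        System.out.println("plain UTF-8 length: " + plain.length);
        System.out.println("writeUTF() length:  " + framed.length);

        // Each format round-trips only through its own reader.
        System.out.println(decodePlain(plain).equals(message));
        System.out.println(decodeWithReadUTF(framed).equals(message));
    }
}
```

If the server sends bytes from getBytes("UTF-8"), the client should read the raw bytes and decode them as plain UTF-8. If the client has to use readUTF(), the server should write with DataOutputStream.writeUTF() instead. Mixing Arabic and ASCII characters in one string is not a problem in either format.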
 