
Java UTF-16 limitation

 
Sharon whipple
Ranch Hand
Posts: 294
I was wondering how J2EE applications that need to display Unicode text handle the UTF-16 limitation.
Is there anything new in Java 6?

Thank you
Sharon
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15205
36
Android IntelliJ IDE Java Scala Spring
What is "the UTF-16 limitation"?
 
Sharon whipple
Ranch Hand
Posts: 294
Java supports only UTF-16 strings.
 
Peter Chase
Ranch Hand
Posts: 1970
Are you imagining that UTF-16 supports only 16 bits' worth (65,536 at most) of characters? If so, that is incorrect.

In a similar way to UTF-8, the thousands of additional characters are handled by surrogate pairs: two reserved 16-bit code units that together describe one character above U+FFFF. The details are at www.unicode.org.

In Java, one does sometimes have to take care, because some character-related methods, such as String.length() and String.charAt(), work in 16-bit Java chars rather than in Unicode characters (code points). The String and Character APIs have had code-point-aware counterparts since Java 5.

Some code (particularly if it is old) might have trouble in some locales if it assumes that the number of Java chars is the number of characters.
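
To make the difference between Java chars and Unicode characters concrete, here is a minimal sketch. The class name CodePointDemo and the helper codePoints are made up for illustration; the String and Character methods it calls are standard Java 5+ APIs.

```java
class CodePointDemo {
    // Hypothetical helper: counts Unicode code points rather than 16-bit chars.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF,
        // so UTF-16 stores it as a surrogate pair: two Java chars.
        String clef = new String(Character.toChars(0x1D11E));
        String text = "A" + clef;

        System.out.println(text.length());    // 3 (16-bit chars)
        System.out.println(codePoints(text)); // 2 (Unicode characters)
        System.out.println(Character.isHighSurrogate(text.charAt(1))); // true

        // Walk the string by code point instead of by char.
        for (int i = 0; i < text.length(); i += Character.charCount(text.codePointAt(i))) {
            System.out.printf("U+%04X%n", text.codePointAt(i));
        }
    }
}
```

Code that indexes or truncates strings by char can split such a pair in half, which is the kind of trouble older code may run into.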
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
The Java I/O FAQ at http://faq.javaranch.com/java/JavaIoFaq links to two blog articles on how to deal with characters beyond 16 bits.
 
Sharon whipple
Ranch Hand
Posts: 294
To be more precise, the question should be: how do large-scale apps use web servers and yet support UTF-8?
Web servers: Apache, WebLogic, Tomcat, JBoss, iAS, etc.
Thank you
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
Java supports many encodings, UTF-8 and ISO-8859 amongst them. If a program (web browser, database, ...) needs to get text in other encodings out of Java code, that's no problem at all. UTF-16 just happens to be the encoding in which strings are stored internally. (Come to think of it, I've never seen a web page served in UTF-16, or a database set up to use UTF-16, so if Java couldn't handle other encodings, that would be a major limitation.)
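
As a rough illustration of getting text in and out of Java in other encodings, the sketch below converts a String to bytes and back. The class and helper names are invented for the example, and StandardCharsets assumes Java 7 or later; on Java 6 one would pass Charset.forName("UTF-8") instead.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Hypothetical helpers: the String is UTF-16 in memory, the bytes are UTF-8.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"

        byte[] utf8 = toUtf8(text);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length);   // 5: é takes two bytes in UTF-8
        System.out.println(latin1.length); // 4: é takes one byte in ISO-8859-1
        System.out.println(fromUtf8(utf8).equals(text)); // true: lossless round trip
    }
}
```

The internal representation only matters at the boundary, where bytes are read or written with an explicitly named encoding.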
 
Sharon whipple
Ranch Hand
Posts: 294
Originally posted by Ulf Dittmer:
if Java couldn't handle other encodings, that would be a major limitation.


http://java.sun.com/docs/books/tutorial/i18n/text/convertintro.html
Java is unable to handle UTF-8.
IE can handle UTF-8 encoding, but when an HTML form is submitted, the web server internally converts the text to UTF-16 (the request/response objects are Java UTF-16 Strings).
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15205
36
Android IntelliJ IDE Java Scala Spring
Originally posted by Sharon whipple:
Java is unable to handle UTF-8,

As already said above, that's not true. Just because Java stores characters in UTF-16 internally does not mean that Java is unable to handle UTF-8. The supported encodings page gives a list of character encodings that Java supports.
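
For anyone who wants to check this on their own JVM, here is a small sketch using java.nio.charset.Charset. The class name and the supports helper are made up for the example; the full list printed at the end varies by JVM, so no particular output is claimed for it.

```java
import java.nio.charset.Charset;

class CharsetListDemo {
    // Hypothetical helper: true if this JVM can encode and decode the named charset.
    static boolean supports(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        // UTF-8, UTF-16 and ISO-8859-1 are required on every Java platform.
        System.out.println(supports("UTF-8"));
        System.out.println(supports("UTF-16"));
        System.out.println(supports("ISO-8859-1"));

        // Everything else this particular JVM offers.
        for (String name : Charset.availableCharsets().keySet()) {
            System.out.println(name);
        }
    }
}
```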
 
Sharon whipple
Ranch Hand
Posts: 294
The pure Java String class is UTF-16.
Web containers/servers built on Java are unable to handle UTF-8.

Is that correct?
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
Originally posted by Sharon whipple:
Web containers/servers built on Java are unable to handle UTF-8

Is that correct?


As both Jesper and I have pointed out, no, that is not correct.
[ October 18, 2007: Message edited by: Ulf Dittmer ]
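
To illustrate why the UTF-16 String is not an obstacle for form data, the sketch below decodes a UTF-8 percent-encoded value into a Java String and encodes it back using only the standard library. The class and helper names are invented for the example. In a real servlet container the equivalent step is normally done by the container, typically after request.setCharacterEncoding("UTF-8") has been called before the parameters are read; this stand-alone version only approximates that.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormDecodingDemo {
    // Hypothetical helper: decodes a form value that the browser sent as UTF-8.
    static String decodeFormValue(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Cannot happen in practice: every JVM is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: encodes a String back to UTF-8 percent-encoding.
    static String encodeFormValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String submitted = "caf%C3%A9"; // "café" as a browser would send it in UTF-8

        String value = decodeFormValue(submitted);
        System.out.println(value.equals("caf\u00e9"));               // true
        System.out.println(encodeFormValue(value).equals(submitted)); // true
    }
}
```

The String in the middle happens to be UTF-16 in memory, but nothing is lost: the same characters come back out as UTF-8 bytes on the response side.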
 