
How many bits are there for UTF characters?

weiliu lili
Ranch Hand

Joined: Apr 11, 2002
Posts: 46
How many bits are there for UTF characters?
Thomas Kijftenbelt
Ranch Hand

Joined: Feb 13, 2002
Posts: 73
UTF-8 uses 1 to 4 bytes per character, depending on the character; any character that fits in a single Java char needs at most 3.
Jamal Hasanov
Ranch Hand

Joined: Jan 08, 2002
Posts: 411
Up to 24 bits (3 bytes) in UTF-8, for characters in the Basic Multilingual Plane.
Jamal Hasanov
John Dale
Ranch Hand

Joined: Feb 22, 2001
Posts: 399
I presume that by UTF you mean the widely used UTF-8. UTF-8 uses 1 to 4 bytes, or 8 to 32 bits, per Unicode character, depending on the character. Characters in the Basic Multilingual Plane need at most 3 bytes.
There are other UTF formats, like UTF-16, that represent the data differently. UTF-16 uses 16 bits for each character in the Basic Multilingual Plane (BMP) and 32 bits (a surrogate pair) for each character outside it.
UTF-16 has the advantage that all BMP characters are the same size, while UTF-8 usually takes less space, at least if most of the characters can be encoded in 8 bits, like the displayable ASCII characters.
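To make the size comparison concrete, here is a minimal sketch that counts the encoded bytes for a few sample characters. The class name EncodingSizes and the helper methods utf8Bytes and utf16Bytes are made up for this illustration; the only library calls are String.getBytes(Charset) and the StandardCharsets constants, which require Java 7 or later. UTF_16BE is used so that no byte-order mark is included in the count.

```java
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Hypothetical helper: number of bytes the string occupies in UTF-8.
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Hypothetical helper: number of bytes in UTF-16 (big-endian, no byte-order mark).
    public static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // U+0041, plain ASCII
            "\u00E9",       // U+00E9, e with acute accent
            "\u20AC",       // U+20AC, euro sign
            "\uD83D\uDE00"  // U+1F600, outside the BMP, stored as a surrogate pair
        };
        for (String s : samples) {
            System.out.printf("U+%04X: UTF-8 = %d bytes, UTF-16 = %d bytes%n",
                    s.codePointAt(0), utf8Bytes(s), utf16Bytes(s));
        }
    }
}
```

On these samples the UTF-8 sizes are 1, 2, 3 and 4 bytes, and the UTF-16 sizes are 2, 2, 2 and 4 bytes, so ASCII-heavy text is smaller in UTF-8 while the euro sign is smaller in UTF-16.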
UTF-8 is more likely to be used when it is known that the data will be accessed serially, as when it is sent across the network. UTF-16 is used when the data might be accessed in random order, as in a file. For example, Windows NT/2000 use UTF-16 to store Unicode data on disk.
For an introduction to Unicode and encoding, you might look at The Unicode® Standard: A Technical Introduction.