How many bits are there for UTF characters?

weiliu lili
Ranch Hand

Joined: Apr 11, 2002
Posts: 46
How many bits are there for UTF characters?
Thomas Kijftenbelt
Ranch Hand

Joined: Feb 13, 2002
Posts: 73
Hi,
UTF-8 uses 1-4 bytes per character (the number of bytes depends on the character; characters in the Basic Multilingual Plane need at most 3).
Greetings,
TK
SCJP
Jamal Hasanov
Ranch Hand

Joined: Jan 08, 2002
Posts: 411
24 bits
Jamal Hasanov
www.j-think.com
John Dale
Ranch Hand

Joined: Feb 22, 2001
Posts: 399
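I presume by UTF, you are talking about the widely used UTF-8. UTF-8 uses 1 to 4 bytes, or 8 to 32 bits, per Unicode character, depending on the character. Characters in the Basic Multilingual Plane (U+0000 to U+FFFF), the range a single Java char covers, need at most 3 bytes, or 24 bits.
There are other UTF formats, like UTF-16, that represent the data differently. UTF-16 uses 16 bits for each character in the Basic Multilingual Plane and a pair of 16-bit units (a surrogate pair, 32 bits in total) for each character outside it.
UTF-16 has the advantage of having all the Basic Multilingual Plane characters the same size, while UTF-8 usually takes less space, at least if most of the characters can be encoded in 8 bits, like the displayable ASCII characters.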
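If you want to check the sizes yourself, here is a minimal sketch using String.getBytes with the standard charsets. The class name EncodingSizes, the helper names utf8Bytes and utf16Bytes, and the sample characters are my own choices for illustration, and it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Number of bytes needed to encode the text in UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed in UTF-16 (big-endian, no byte-order mark).
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // U+0041, plain ASCII
            "\u00E9",       // U+00E9, e with acute accent
            "\u20AC",       // U+20AC, euro sign
            "\uD83D\uDE00"  // U+1F600, outside the BMP (a surrogate pair in Java)
        };
        for (String s : samples) {
            System.out.printf("U+%04X: UTF-8 = %d bytes, UTF-16 = %d bytes%n",
                    s.codePointAt(0), utf8Bytes(s), utf16Bytes(s));
        }
    }
}
```

The four samples take 1, 2, 3 and 4 bytes in UTF-8, and 2, 2, 2 and 4 bytes in UTF-16.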
UTF-8 is more likely to be used when it is known that the data will be accessed serially, as when it is sent across the network. UTF-16 is used when the data might be accessed in random order, as in a file. For example, Windows NT/2000 use UTF-16 to store Unicode data on disk.
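To make the serial versus random access point concrete, here is a rough sketch that finds the byte offset of the n-th character in each encoding. The class RandomAccessDemo and both method names are hypothetical, and the UTF-16 shortcut assumes the data holds only Basic Multilingual Plane characters and has no byte-order mark.

```java
import java.nio.charset.StandardCharsets;

public class RandomAccessDemo {
    // UTF-16 with only BMP characters: character n starts at byte 2 * n,
    // so the offset can be computed without reading the data.
    static int utf16Offset(int n) {
        return 2 * n;
    }

    // UTF-8: walk from the start and count the bytes that begin a character.
    // Continuation bytes have the bit pattern 10xxxxxx and are skipped.
    static int utf8Offset(byte[] data, int n) {
        int seen = 0;
        for (int i = 0; i < data.length; i++) {
            boolean continuation = (data[i] & 0xC0) == 0x80;
            if (!continuation) {
                if (seen == n) {
                    return i;
                }
                seen++;
            }
        }
        throw new IndexOutOfBoundsException("fewer than " + (n + 1) + " characters");
    }

    public static void main(String[] args) {
        String text = "\u00E9\u20ACb"; // e with acute accent, euro sign, 'b'
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 offset of character 2:  " + utf8Offset(utf8, 2));
        System.out.println("UTF-16 offset of character 2: " + utf16Offset(2));
    }
}
```

For this sample the 'b' sits at byte 5 in UTF-8 (after a 2-byte and a 3-byte character) and at byte 4 in UTF-16. The UTF-8 version has to read every byte before the one it wants, which is the cost referred to above.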
For an introduction to Unicode and encoding, you might look at The Unicode® Standard: A Technical Introduction.
 