According to R&H, UTF encoding uses as many bits as needed to encode a character (p. 439). However, Bill Brogden says in his Exam Cram book (p. 287) that a single character, using the UTF-8 encoding scheme, may end up encoded in one, two or three bytes, but not more. Which version should we follow in the real test? There are self-test questions based on this point in both books. I also checked Khalid's book, which says "the UTF8 encoding has a multi-byte encoding format" (p. 570), so he can't be wrong either way. Please also note that the three books use different names for this encoding scheme, which itself might reflect my point. R&H: UTF; Brogden: UTF-8; Khalid: UTF8. Can someone shed more light? [This message has been edited by Tom Tang (edited February 11, 2001).]
I checked www.unicode.org and found out that there are UTF-8, UTF-16 and UTF-32 encodings. Maybe that answers the question, but I would still like anybody with better knowledge to shed more light.
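For anyone who wants to see the difference between the encodings rather than take a book's word for it, here is a minimal sketch. It assumes a Java 7 or later runtime (for `StandardCharsets`); the class and method names are made up for illustration. It counts how many bytes the same text needs in standard UTF-8 and in UTF-16 (big-endian, no byte order mark).

```java
import java.nio.charset.StandardCharsets;

class EncodingLengths {
    // Bytes needed for s in standard UTF-8 (not Java's modified UTF-8).
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed for s in UTF-16, big-endian, without a byte order mark.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Latin letter, accented letter, CJK ideograph, and a character
        // outside the 16-bit range (stored as a surrogate pair in a String).
        String[] samples = {"A", "\u00e9", "\u4e2d", "\ud83d\ude00"};
        for (String s : samples) {
            System.out.println(utf8Length(s) + " byte(s) in UTF-8, "
                    + utf16Length(s) + " byte(s) in UTF-16");
        }
    }
}
```

The first three samples each fit in a single 16-bit Java char and take one, two and three bytes in UTF-8. The last sample lies outside the 16-bit range and takes four bytes in standard UTF-8, so the "not more than three" statement only holds per 16-bit char.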
As usual, I found the answer on Maha Anna's discussion page:

"Java uses a system called UTF for I/O to support international character sets." True. Java uses a conversion method called UTF-8, which is a subset of UTF. It is a subset in the sense that in true UTF a char can be encoded in anywhere from 1 byte to any number of bytes. This means it can cover all chars in all languages in the world, so UTF is the true transformation. A small char can take fewer bytes, while a big look-and-feel Asian char may be encoded with many bytes.

Since all chars in Java have at most 16 bits (a Unicode char), all I/O operations that need to transform chars to bytes (all readers/writers) use a predefined transformation format, i.e. a char can be encoded in 1, 2 or 3 bytes only, 3 bytes at most. There are rules for which chars are encoded with how many bytes; they are in the Java docs.

I also recently found an error in Bill Brogden's book which illustrates this concept. The link: www.javaranch.com/maha/Discussions/java_io_Package/true-false_-_JavaRanch_Big_Moose_Saloon.htm [This message has been edited by Tom Tang (edited February 12, 2001).]
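The 1, 2 or 3 byte rule can be checked directly with a small sketch. This one uses `DataOutputStream.writeUTF`, whose Javadoc documents the per-char byte ranges (Java's "modified UTF-8"). That is an assumption about which API the quoted rule refers to, and the class and helper names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WriteUtfLengths {
    // Bytes writeUTF uses for the characters of s, not counting the
    // 2-byte length prefix that writeUTF always writes first.
    static int encodedLength(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buf.size() - 2;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("A"));       // Latin letter
        System.out.println(encodedLength("\u00e9"));  // accented letter
        System.out.println(encodedLength("\u4e2d"));  // CJK ideograph
    }
}
```

The three calls print 1, 2 and 3, matching the per-char ranges in the `writeUTF` documentation. A character outside the 16-bit range is held as two chars in a String, and each of those is encoded separately.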