
Java Character Encoding

 
j_mcd
Greenhorn
Posts: 2
From Marcus Green site:
For reasons of compactness Java uses a system called UTF-8 for string literals, identifiers and other text within programs.
From Roberts, Heller:
Java uses two kinds of text representation:
Unicode for internal representation of characters and strings

I'm assuming one of these statements is incorrect. Could someone help?
Thanks!
 
Angela Lamb
Ranch Hand
Posts: 156
I may be wrong, but I think the Marcus Green quote is referring to the actual text of the Java code, while the RHE quote is talking about character and string variables within a program.
 
Jane Griscti
Ranch Hand
Posts: 3141
j_mcd,
Unicode is an international character set standard, and Java stores each char as a 16-bit Unicode value. The first 128 Unicode values match the older ASCII character set, and the first 256 match Latin-1 (ISO 8859-1).
UTF-8 is the standard encoding name used to identify Unicode text stored as a sequence of 8-bit units: ASCII characters take one byte each, and all other characters take two or more. UTF-16 is the standard encoding name used to identify Unicode text stored as 16-bit units.
If you use a Win/DOS based system, the source files you create will most likely be saved as 8-bit characters in the platform's default encoding. The compiler translates these into 16-bit Unicode before processing the source file.
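To make the difference concrete, here is a minimal sketch that encodes the same characters as UTF-8 and as UTF-16 and counts the bytes. The class and method names (EncodingDemo, utf8Length, utf16Length) are made up for illustration, and it assumes a JDK recent enough to have java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Number of bytes needed to store the text as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to store the text as UTF-16
    // (big-endian, so no byte-order mark is added).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String ascii = "A";          // U+0041, in the ASCII range
        String accented = "\u00e9";  // U+00E9, in Latin-1 but not in ASCII
        String euro = "\u20ac";      // U+20AC, outside Latin-1

        // Each of these is a single 16-bit char inside the program.
        System.out.println(ascii.length() + " " + accented.length() + " " + euro.length()); // 1 1 1

        // UTF-8 is variable length; UTF-16 uses two bytes for each of these.
        System.out.println(utf8Length(ascii) + " " + utf16Length(ascii));       // 1 2
        System.out.println(utf8Length(accented) + " " + utf16Length(accented)); // 2 2
        System.out.println(utf8Length(euro) + " " + utf16Length(euro));         // 3 2
    }
}
```

So the in-memory representation (16-bit chars) and the stored representation (UTF-8 or another encoding) are two separate things, which is why both quotes can hold at the same time.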
Hope that helps.
PS Please read the JavaRanch Name Policy
and re-register using a name that complies with the rules.
Thanks for your cooperation.
------------------
Jane Griscti
Sun Certified Programmer for the Java™ 2 Platform
 