unicode into char

Adrian Stent
Greenhorn

Joined: Nov 19, 2004
Posts: 4
I thought a Unicode value had to be in single quotes when assigned to a char. For some reason, the following seems to compile and run OK, but only in the range \u0030 to \u0039.

char d = \u0032; /* Compiles ok */
System.out.print(d);

char d = \u0040; /* Syntax error on token "Invalid Character", invalid VariableInitializer comes up */

Hope someone can explain to me why single quotes are not needed in the first example.
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8708
    

char is actually a numeric value:

4.2.1 Integral Types and Values
The values of the integral types are integers in the following ranges:

* For byte, from -128 to 127, inclusive
* For short, from -32768 to 32767, inclusive
* For int, from -2147483648 to 2147483647, inclusive
* For long, from -9223372036854775808 to 9223372036854775807, inclusive
* For char, from '\u0000' to '\uffff' inclusive, that is, from 0 to 65535

The Java Specification
It just so happens that \u0030 to \u0039 are the Unicode code points for the decimal digits 0-9. On its first pass the compiler translates Unicode escapes into the characters they stand for, so your unquoted escapes become the digits 0-9 in the source text. From then on the declarations are treated as perfectly legal integer assignments. \u0040, by contrast, becomes the character @, which is not a valid initializer, hence the syntax error.
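A minimal sketch of the behaviour described above, assuming a standard Java compiler; the class and method names are made up for illustration. It compares an unquoted escape, which the compiler sees as the digit 2, with the same escape inside single quotes, which yields the character '2'.

```java
public class EscapeDemo {
    // The unquoted escape below is translated to the digit 2 before the
    // statement is parsed, so it is the same as writing: char d = 2;
    static char bareEscape() {
        char d = \u0032;
        return d;
    }

    // Inside single quotes the escape becomes the character '2' (code 0x32).
    static char quotedEscape() {
        char d = '\u0032';
        return d;
    }

    public static void main(String[] args) {
        System.out.println((int) bareEscape());   // 2
        System.out.println((int) quotedEscape()); // 50
        System.out.println(quotedEscape());       // 2
    }
}
```

Casting to int before printing shows the stored value rather than the glyph, which is why the first line prints 2 and the second prints 50.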
[ February 10, 2005: Message edited by: Joe Ess ]

Adrian Stent
Greenhorn

Joined: Nov 19, 2004
Posts: 4
Cheers for that. I think I'm almost there, but I'm confused when a dot is displayed instead of a number, e.g.:
char b = \u0031;
System.out.println(b); // I expected this to print 1, but it prints a dot.
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8708
    

You are skipping a step in your mental compilation. If you write this:

char b = \u0031;

The Java compiler first translates the Unicode escape into its character equivalent (all Java source files are treated as Unicode):

char b = 1;

This sets the integral char to a value of 0x01 (or \u0001 if you want to stick with Unicode). If you print out b, you will be printing a "Start of Heading" control character, which some consoles render as a smiley face or a dot.
The above line is not the same thing as the following lines:

char x = '1';
char y = '\u0031';

The first example, x, uses a character literal to set the integral char to a value of 0x31 (or \u0031). The second example, y, uses a Unicode escape inside a character literal to do the same.
Now if you print out x or y, you will print out "1". The lesson here is that char values aren't really "characters", they are integral values which are interpreted as "characters" in the correct context. Perhaps a gander at the Unicode Character Table is in order?
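To make the "char is an integral value" point concrete, here is a small sketch; the class name, variable names, and the codeOf helper are assumptions chosen for illustration. It shows three spellings of the same value and what happens when a char is treated as a number.

```java
public class CharIsIntegral {
    // Widening a char to int exposes the number it actually stores.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char x = '1';        // character literal
        char y = '\u0031';   // same character, written with an escape
        char z = 0x31;       // same value, written as a hex integer
        char b = 1;          // value 1, the control character U+0001

        System.out.println(x == y && y == z);  // true
        System.out.println(codeOf(x));         // 49
        System.out.println(codeOf(b));         // 1
        System.out.println(x - '0');           // 1
        System.out.println((char) (x + 1));    // 2
    }
}
```

Whether a value prints as a number or as a glyph depends only on the type passed to println: an int prints its digits, a char prints the character with that code.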
abalfazl hossein
Ranch Hand

Joined: Sep 06, 2007
Posts: 606


output:
633


http://www.fileformat.info/info/unicode/char/633/index.htm

C/C++/Java source code "\u0633"


It is only 633, but on that page it is \u0633.

What is the reason for this difference? What about the 'u' and the '0'?
Kurt Van Etten
Ranch Hand

Joined: Sep 07, 2010
Posts: 98
abalfazl hossein wrote:
It is only 633, but on that page it is \u0633.

What is the reason for this difference?


In your source code you need to tell the compiler that you're specifying a hex or Unicode value rather than a decimal integer, by doing something like this:

But the actual hex value is just 633, so that is what is output. If you want to make the output look like source code, you could (almost) do something like this:

I say "almost" because you will need to refine that code a little to get the leading zero(s) when the hex value is less than 0x1000.
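The snippets from this post did not survive, so the following is only a sketch of one way to produce the source-code form with leading zeros, not necessarily what was originally posted. The class and method names are assumptions; String.format and Integer.toHexString are standard library calls.

```java
public class UnicodeEscapeFormat {
    // Formats a char the way it would be written as a Java Unicode escape.
    // %04x pads the hex value with leading zeros to four digits.
    static String toEscape(char c) {
        return String.format("\\u%04x", (int) c);
    }

    public static void main(String[] args) {
        char seen = 0x633;  // ARABIC LETTER SEEN, as on the linked page

        // The raw hex value has no prefix and no padding.
        System.out.println(Integer.toHexString(seen));  // 633

        // The padded form matches what the page shows for source code.
        System.out.println(toEscape(seen));  // backslash, u, then 0633
        System.out.println(toEscape('1'));   // backslash, u, then 0031
    }
}
```

The "\\u" in the format string is a doubled backslash on purpose: a single backslash followed by u would itself be read as the start of a Unicode escape by the compiler.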
abalfazl hossein
Ranch Hand

Joined: Sep 06, 2007
Posts: 606
edited
 