Unicode Value????

rajashree ghatak
Ranch Hand

Joined: Mar 10, 2001
Posts: 151
hi all,
The following code is giving compile time error:
char var='\u000a';
System.out.print("Hi");
System.out.print(var);
System.out.println("All");
'\u000a' is the Unicode value for newline according to p. 26 of Khalid Mughal. The compile error says: "Invalid Character Constant".
But if we assign '\n' to var, the program compiles and executes fine.
A similar error occurs when we use '\u005c' in place of '\\' for the backslash escape sequence.
Can someone explain why this error occurs?
Thanks in advance,
rajashree.
V Srinivasan
Ranch Hand

Joined: Aug 16, 2000
Posts: 99
We can't print white space or escape sequence characters.
rajashree ghatak
Ranch Hand

Joined: Mar 10, 2001
Posts: 151
Srinivasan,
Why can't we print escape sequence characters?
Aren't '\n', '\t', '\b', '\"' and '\f' examples of escape sequence characters?
'\u0020' is the Unicode value for a space.
Could someone throw some light on my previously posted query?
rajashree.

V Srinivasan
Ranch Hand

Joined: Aug 16, 2000
Posts: 99
Hi,
Unicode is for characters, not for the system. As you said, let's hope somebody sheds some light.
Thanks & regards,
V.Srinivasan
Cindy Glass
"The Hood"
Sheriff

Joined: Sep 29, 2000
Posts: 8521
The compiler translates Unicode escapes before it does anything else with your source. During that step every \uXXXX sequence is replaced by its corresponding character. The newline got translated and USED as a real line break in your code, all before compiling proper even started. So what the compiler saw was:
Input-
char var='\u000a';
After parsing-
char var=' //new line used up here and now GONE
';
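
A minimal sketch of a working version of the original snippet, assuming the goal is simply a line break between "Hi" and "All". The class name NewlineCharDemo and the helper buildOutput are invented for illustration. Both the escape sequence and a numeric cast yield the same char without putting a raw line terminator between the quotes.

```java
public class NewlineCharDemo {
    // Hypothetical helper: joins the two words with the given separator character.
    static String buildOutput(char separator) {
        return "Hi" + separator + "All";
    }

    public static void main(String[] args) {
        char viaEscape = '\n';         // escape sequence, resolved when the literal is tokenized
        char viaCast = (char) 0x000a;  // numeric code point, no Unicode escape involved

        System.out.println(viaEscape == viaCast);    // prints true
        System.out.println(buildOutput(viaEscape));  // prints Hi and All on separate lines
    }
}
```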

"JavaRanch, where the deer and the Certified play" - David O'Meara
rajashree ghatak
Ranch Hand

Joined: Mar 10, 2001
Posts: 151
Hi Cindy,
Thanks for your response. But what I fail to understand is why there is a compile-time error when we use the Unicode values for escape sequence characters with hex digits a, b, c, d, e, f, like '\u000a' (newline), '\u000c' (form feed), '\u000d' (carriage return) or '\u005c' (backslash).
All of the above give the compile error: Invalid character constant.
But '\u0022' (double quote), '\u0008' (backspace) and '\u0009' (horizontal tab)
compile fine and execute. The one exception is '\u0027', the Unicode value for the single quote, which again gives the compile error Invalid Character Constant.
Kindly comment on this.
rajashree.
Cindy Glass
"The Hood"
Sheriff

Joined: Sep 29, 2000
Posts: 8521
The characters that are part of the language itself are the problem. Backspace and Horizontal Tab play no part in Java's syntax, so they are understood to be what they claim to be. A double quote embedded in single quotes becomes '"', which is clear.
A single quote embedded in single quotes ( ''') causes a problem because, according to the rules of the syntax, the character value that you are defining is complete after the second single quote, but then you have an additional single quote hanging there which does not fit any correct Java syntax. The single quotes are PART OF the lexical structure of the definition of a character literal.
As a matter of fact, ANY character that Java is trying to use to understand your syntax becomes a problem if you are trying to use it as a literal instead of as part of the syntax. Somehow you have to tell the compiler which way you intend that character to be used. So the rule is: if you want one of those syntax-involved characters to be treated as a literal instead of as part of your code, use the provided escape sequence instead.
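
A short sketch of the rule above, assuming a standard javac. Unicode escapes for characters that play no role in the literal's own syntax (double quote, tab) are accepted between single quotes; the single quote and the line feed have to be written with their escape sequences. The class and method names are invented for the example.

```java
public class CharLiteralEscapesDemo {
    static char doubleQuote() {
        return '\u0022';  // becomes a double quote between single quotes, a valid literal
    }

    static char tab() {
        return '\u0009';  // becomes a raw tab between the quotes, also valid
    }

    static char singleQuote() {
        return '\'';      // the Unicode escape for 0x0027 would leave three quotes in a row
    }

    static char lineFeed() {
        return '\n';      // the Unicode escape for 0x000a would split the literal over two lines
    }

    public static void main(String[] args) {
        System.out.println((int) doubleQuote());  // prints 34
        System.out.println((int) tab());          // prints 9
        System.out.println((int) singleQuote());  // prints 39
        System.out.println((int) lineFeed());     // prints 10
    }
}
```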

From the JLS on the Lexical Structure of the language:
3.10.4 Character Literals

Because Unicode escapes are processed very early, it is not correct to write '\u000a' for a character literal whose value is linefeed (LF); the Unicode escape \u000a is transformed into an actual linefeed in translation step 1 (§3.3) and the linefeed becomes a LineTerminator in step 2 (§3.4), and so the character literal is not valid in step 3. Instead, one should use the escape sequence '\n' (§3.10.6). Similarly, it is not correct to write '\u000d' for a character literal whose value is carriage return (CR). Instead, use '\r'.
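
One way to see how early that translation happens: an escape is accepted anywhere in the source, even inside a keyword, because the compiler only ever sees the translated text. This is a small sketch with invented names (EarlyTranslationDemo, firstLetter, greeting), not something taken from the JLS.

```java
public class EarlyTranslationDemo {
    // The return type below is the keyword char, with its letter 'a' spelled as an escape (0x0061).
    static ch\u0061r firstLetter() {
        return 'J';
    }

    static String greeting() {
        return "\u0048\u0069";  // translated to the two letters H and i before tokenizing
    }

    public static void main(String[] args) {
        System.out.println(firstLetter());  // prints J
        System.out.println(greeting());     // prints Hi
    }
}
```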
V Srinivasan
Ranch Hand

Joined: Aug 16, 2000
Posts: 99
Thanks Cindy,
You have cleared my doubt too. Somewhere I read that the print() method doesn't understand Unicode characters, but that the write() method does. Is that so? Could you please give a few lines on that, or tell me where I can find a write-up on these issues?
Thanks in advance.
Regards,
V. Srinivasan
rajashree ghatak
Ranch Hand

Joined: Mar 10, 2001
Posts: 151
Thanks Cindy.
You have explained it very well and also cleared up my query.
rajashree.
Cindy Glass
"The Hood"
Sheriff

Joined: Sep 29, 2000
Posts: 8521
From the API for PrintStream:

All characters printed by a PrintStream are converted into bytes using the platform's default character encoding. The PrintWriter class should be used in situations that require writing characters rather than bytes.

A Java char is a 16-bit Unicode value; a byte is 8 bits.
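
A rough sketch of the difference, assuming an in-memory sink so the result can be inspected. The class and helper names are invented; the PrintStream is given an explicit encoding here instead of the platform default so the byte counts are predictable. The same four chars turn into a different number of bytes depending on the encoding, while a PrintWriter over a StringWriter passes the chars through unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UnsupportedEncodingException;

public class PrintEncodingDemo {
    // Prints text through a PrintStream with the named encoding and returns the raw bytes.
    static byte[] viaPrintStream(String text, String encoding) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(sink, true, encoding);
            out.print(text);
            out.flush();
            return sink.toByteArray();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    // Prints text through a PrintWriter; the characters are never converted to bytes.
    static String viaPrintWriter(String text) {
        StringWriter sink = new StringWriter();
        PrintWriter out = new PrintWriter(sink);
        out.print(text);
        out.flush();
        return sink.toString();
    }

    public static void main(String[] args) {
        String text = "caf" + (char) 0x00E9;  // four chars, the last one is e with an acute accent

        System.out.println(viaPrintStream(text, "UTF-8").length);       // prints 5
        System.out.println(viaPrintStream(text, "ISO-8859-1").length);  // prints 4
        System.out.println(viaPrintWriter(text).equals(text));          // prints true
    }
}
```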
V Srinivasan
Ranch Hand

Joined: Aug 16, 2000
Posts: 99
Thank you very much Cindy.
Prosenjit Banerjee
Greenhorn

Joined: May 04, 2001
Posts: 20
Thank you very much, Cindy Glass and the thread starter rajashree ghatak. And here, look what I did; it's a new experience.
The following is a Java program that consists only of Unicode escapes (although there are some new lines, for the sake of clarity only).
This code compiles and runs without any error.


This is what it actually looks like:

I LOVE JAVARANCH.COM
Prosenjit Banerjee
Greenhorn

Joined: May 04, 2001
Posts: 20
Hi everybody,
Comments on my previous post are welcome, please; I want to compare my thinking with others'.
I LOVE JAVARANCH.COM
Cindy Glass
"The Hood"
Sheriff

Joined: Sep 29, 2000
Posts: 8521
Very cute.
In Just Java 2, Peter van der Linden has some clever code tricks along such lines that you can use to bewilder your co-workers. It's really quite amusing.
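
A small sketch of the kind of trick this early translation allows. It is not taken from the book or from the post above, and the class name is invented. A line-feed escape inside a // comment ends the comment, so whatever follows it on the same source line is compiled as ordinary code.

```java
public class HiddenStatementDemo {
    static int run() {
        int count = 0;
        // The escape at the end of this comment breaks the line, so the increment after it runs: \u000a count++;
        return count;
    }

    public static void main(String[] args) {
        System.out.println(run());  // prints 1
    }
}
```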
Prosenjit Banerjee
Greenhorn

Joined: May 04, 2001
Posts: 20
Thanks very much Cindy. Thanks again for the reference.
 
 