unicode escapes

Abu Yoosuf
Ranch Hand

Joined: Nov 14, 2002
Posts: 33

JLS 3.3
The character produced by a Unicode escape does not participate in further
Unicode escapes. For example, the raw input \u005cu005a results in the six characters
\ u 0 0 5 a, because 005c is the Unicode value for \.
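
A small sketch may help make the quoted rule concrete. This is not how javac is implemented; it is a simplified, hypothetical model of the translation step described in JLS 3.3 (the names UnicodeEscapeModel and translate are invented for illustration). It assumes exactly four hex digits per escape and copies malformed escapes through unchanged, where a real compiler would report an error.

```java
public class UnicodeEscapeModel {

    /**
     * Simplified model of the JLS 3.3 translation step: each eligible
     * backslash-u escape is replaced by the character it names. A backslash
     * is eligible only if an even number of raw backslashes immediately
     * precede it. The translated character is appended to the output and is
     * never scanned again, so it cannot start another escape.
     */
    static String translate(String raw) {
        StringBuilder out = new StringBuilder();
        int precedingBackslashes = 0; // contiguous raw backslashes just before index i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '\\') {
                out.append(c);
                precedingBackslashes = 0;
                i++;
                continue;
            }
            boolean eligible = precedingBackslashes % 2 == 0;
            int j = i + 1;
            while (j < raw.length() && raw.charAt(j) == 'u') {
                j++; // the JLS allows the letter u to be repeated
            }
            if (eligible && j > i + 1 && j + 4 <= raw.length() && isHex(raw, j)) {
                out.append((char) Integer.parseInt(raw.substring(j, j + 4), 16));
                i = j + 4;
                precedingBackslashes = 0; // the last raw character consumed was a hex digit
            } else {
                out.append(c);
                precedingBackslashes++;
                i++;
            }
        }
        return out.toString();
    }

    /** True if the four characters starting at index from are all hex digits. */
    private static boolean isHex(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // This literal denotes the 11 raw characters of the JLS example:
        // backslash, u, 0, 0, 5, c, u, 0, 0, 5, a.
        String raw = "\\u005cu005a";
        String translated = translate(raw);
        System.out.println(translated);          // backslash, u, 0, 0, 5, a
        System.out.println(translated.length()); // 6
    }
}
```

The backslash produced by the first escape is written straight to the output, so the u005a that follows it is copied as five ordinary characters rather than being combined with it into a second escape.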

I tried to test the following code snippet, but it fails with "illegal escape character".

What am I missing? Would appreciate any clarifications. Thanks.
James Chegwidden
Ranch Hand

Joined: Oct 06, 2002
Posts: 201
You have too many characters in the sequence. That's why it produces an error.

Abu Yoosuf
Ranch Hand

Joined: Nov 14, 2002
Posts: 33
Are you disagreeing with what is in JLS 3.3?
I tried a few other variations.
1. String s = "\u005c\u005a"; //doesn't compile
2. String ss = "\\u005c\u005a"; //output is \u005cZ
3. String sss = "\u005c\\u005a"; //output is \Z
I can understand the output for 2 and 3, but not for 1. If \u005c translates to \, I expected s to end up equivalent to "\\u005a" by the end of the translation.
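
For reference, variation 2 can be inspected character by character. This is only a sketch (the class name EscapeVariations and the field name SS are made up here). Variation 1 is left out because it does not compile, and variation 3 is left out to keep the sketch to a single case.

```java
public class EscapeVariations {

    // Variation 2. The first backslash is an ordinary character, so the second
    // backslash is preceded by an odd number of backslashes and cannot start a
    // Unicode escape. Together the two form the escape sequence for a single
    // backslash. The trailing Unicode escape is translated to 'Z'.
    static final String SS = "\\u005c\u005a";

    public static void main(String[] args) {
        System.out.println(SS);          // backslash, u, 0, 0, 5, c, Z
        System.out.println(SS.length()); // 7
        for (char c : SS.toCharArray()) {
            // Print each code unit so the backslash and the 'Z' are easy to spot.
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```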
Jose Botella
Ranch Hand

Joined: Jul 03, 2001
Posts: 2120
This is from JLS 3.10.6
(within a string literal...)

...It is a compile-time error if the character following a backslash in an escape is not
an ASCII b, t, n, f, r, ", ', \, 0, 1, 2, 3, 4, 5, 6, or 7. The Unicode escape \u is processed earlier

The following string is OK:
"\u005ctThis would be printed at a tab distance"
This string is equivalent to "\tThis would be..."
because \u005c has been translated to a backslash.
However, even though \u0009 is the Unicode escape for \t, following \u005c with u0009
will cause a compiler error, because \u005c has been translated to "\" and you cannot have "\u" within a string literal.
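
The legal case can be checked with a short program. This is a minimal sketch (the class and constant names are hypothetical) comparing a literal that spells the backslash as a Unicode escape with the ordinary tab escape.

```java
public class TabEscapeDemo {

    // The Unicode escape for a backslash is translated first, so the lexer then
    // sees a backslash followed by 't' and reads it as the ordinary tab escape.
    static final String VIA_UNICODE = "\u005ctindented";
    static final String PLAIN = "\tindented";

    public static void main(String[] args) {
        System.out.println(VIA_UNICODE.equals(PLAIN));   // true
        System.out.println((int) VIA_UNICODE.charAt(0)); // 9, the code of the tab character
    }
}
```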

Abu Yoosuf
Ranch Hand

Joined: Nov 14, 2002
Posts: 33
I agree. Here's the link: