Escape sequences and compiler errors

Kezia Matthews
Ranch Hand

Joined: May 19, 2001
Posts: 107
Hi All,
In the following code, why do the uncommented lines compile without errors, while the commented-out lines produce compiler errors once the comment markers are removed?

Can anybody help me?
Thanks.
Kezia.
Patrick Mugabe
Ranch Hand

Joined: Jan 08, 2002
Posts: 132
Your code is fine and it compiles with no errors.
Try recompiling.
Kezia Matthews
Ranch Hand

Joined: May 19, 2001
Posts: 107
Hi Patrick,
I think you did not understand my question.
I want to know why the commented-out lines in the code give compilation errors when the comment markers are removed, whereas the uncommented lines give no errors, even though both assign Unicode values to a char variable.
Hope you got my question right this time.
Kezia.
bd
Greenhorn

Joined: Jan 09, 2002
Posts: 7
The '\u0027' single-quote literal is invalid because it would evaluate to: '''
This is obviously unacceptable and is rejected by the compiler. Use '\'' instead.
There is a description of this at www.javasoft.com
BUG ID: 4090696
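
A minimal sketch of the workaround suggested above. The class and method names are only illustrative; the point is that the '\'' escape sequence is handled when the literal itself is parsed, so it does not collide with the surrounding quotes, and a cast from the numeric value 0x27 gives the same character.

```java
public class SingleQuoteDemo {
    // Escape-sequence form: resolved while the literal is parsed, so it is legal.
    static char viaEscape() {
        return '\'';
    }

    // Numeric form: cast the code point 0x27 (decimal 39) to char.
    static char viaCodePoint() {
        return (char) 0x27;
    }

    public static void main(String[] args) {
        System.out.println(viaEscape() == viaCodePoint()); // prints "true"
        System.out.println((int) viaEscape());             // prints "39"
    }
}
```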
Kezia Matthews
Ranch Hand

Joined: May 19, 2001
Posts: 107
Why does the backspace Unicode value '\u0008' or the form feed Unicode value '\u000c' not generate compiler errors?
Kezia.
[ January 11, 2002: Message edited by: Kezia Matthews ]
Robert Troshynski
Greenhorn

Joined: Jan 11, 2002
Posts: 16
Unicode escapes are special to a Java program in that
the compiler looks through the code for those
sequences and interprets them first.
So, trying an assignment like
char test = '\u000a';
will not work since the compiler interprets
the escape sequence first as a newline such that
the compiler will see your code as
char test = '
';
As you know, you cannot have a line break in a
literal.
To take this further, you can also have a variable
declaration like
char \u0061\u0062\u0063\u0064;
and the compiler will interpret it as
char abcd;
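
The identifier case can be tried with a small sketch like the one below. The class name and the value 42 are made up for illustration; the variable is declared with escapes and then read back under its plain name.

```java
public class EscapedIdentifierDemo {
    static int demo() {
        // After Unicode-escape translation this line reads: int abcd = 42;
        int \u0061\u0062\u0063\u0064 = 42;
        // The plain name refers to the same variable.
        return abcd;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "42"
    }
}
```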
[ January 11, 2002: Message edited by: Robert Troshynski ]

Jose Botella
Ranch Hand

Joined: Jul 03, 2001
Posts: 2120
The following will generate a compile error even when commented out
//char c = '\u000a';
//char d = '\u000d';
but they are ok within /* */
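
A small sketch of the block-comment case, with illustrative names. Inside /* */ the translated line break just becomes part of the comment, so the file still compiles. The same two lines behind // would end the comment early and leave a stray '; on the next line. The working code uses the \n and \r escape sequences, which are the usual way to write these characters.

```java
public class CommentEscapeDemo {
    /* char c = '\u000a'; the translated line break stays inside this block comment */
    /* char d = '\u000d'; */

    // Escape sequences are resolved inside the literal, after line splitting.
    static char lineFeed() {
        return '\n';
    }

    static char carriageReturn() {
        return '\r';
    }

    public static void main(String[] args) {
        System.out.println((int) lineFeed());       // prints "10"
        System.out.println((int) carriageReturn()); // prints "13"
    }
}
```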


Kezia Matthews
Ranch Hand

Joined: May 19, 2001
Posts: 107
For
char a = '\u000a';
the compiler translates it immediately. So, the above statement turns out to be,
char a = '
';
and since there is a line break in the char literal, the compiler flags an error.
The same is the case with
char b = '\u0027';
which gets translated to
char b = ''';
so the compiler flags an error.
But for
char c = '\u0008';
which is the Unicode value for backspace, and gets translated to
char c = ';
why does the compiler not flag any error?
What about
char d = '\u000c';
which is the Unicode value for form feed? Why does the compiler not flag an error in that case either?
Can anybody clear this for me?
Thanks,
Kezia.
Jose Botella
Ranch Hand

Joined: Jul 03, 2001
Posts: 2120

But for
char c = '\u0008';
which is the unicode value for backspace, and gets translated to
char c = ';
why does the compiler not flag any error?
What about
char d = '\u000c';
which is the unicode value for formfeed, why does the compiler not flag an error in the above case also?

char c = '\u0008' is not translated to the expression you mention. The escape is replaced by the backspace character itself, which then sits inside the literal; the compiler does not apply it to erase the preceding quote.
JLS 3.3 says that all Unicode escapes are translated to the corresponding Unicode character.
For instance, \u0041 is translated to A. The purpose of this lexical translation is to let text editors that don't support Unicode produce Unicode characters, because every symbol in a Unicode escape is ASCII and all editors handle ASCII.
JLS 3.4 says that the next lexical translation step recognizes the line terminator characters:
\n (Unix), \r (Mac) and \r\n (Windows)
Doing so determines the lines of the program.
Because line terminators are processed at this step, they are not allowed to be part of character or string literals, as stated in JLS 3.10.4 and 3.10.5.
Also note that JLS 3.4 recognizes only line terminators. The rest of the Unicode characters are left untouched, and they can appear within a string or character literal, with the logical exceptions of ', " and \.
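
To check this for the two literals in question, a sketch along these lines should compile and run. The class and method names are only illustrative. It shows that the backspace and form feed escapes simply become those characters inside the literal.

```java
public class ControlCharLiteralDemo {
    // The escape is replaced by the backspace character (code 8) inside the quotes.
    static char backspace() {
        return '\u0008';
    }

    // Likewise, this escape becomes the form feed character (code 12).
    static char formFeed() {
        return '\u000c';
    }

    public static void main(String[] args) {
        System.out.println((int) backspace());   // prints "8"
        System.out.println((int) formFeed());    // prints "12"
        System.out.println(backspace() == '\b'); // prints "true"
        System.out.println(formFeed() == '\f');  // prints "true"
    }
}
```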
Kezia Matthews
Ranch Hand

Joined: May 19, 2001
Posts: 107
Jose,
Thanks a ton.
Kezia.
 