valid Identifier

 
Binu K Idicula
Ranch Hand
Posts: 99
class NameTest {
    static int $$ = 1;
    static int \u00aa = 2;
    static int \u0000 = 1;

    public static void main(String args[]) {
        System.out.println(NameTest.\u00aa);
    }
}
Compiling this produces an error saying that \u0000 is an invalid identifier. But can we say exactly which Unicode characters are valid? Is it limited by the character set support?
I suppose code points that are not defined as characters are illegal to start an identifier with. Is that right?
 
Binu K Idicula
Ranch Hand
Posts: 99
Not a reply, but another doubt.
How can we specify an identifier using Unicode escape sequences, like
int \u00aa\u0000\uafff = 1; (is that right?)
Is there any restriction other than "must not start with a digit" and "only allowed characters"?
When I tried
int \u00aa\u0000 = 1;
it wasn't allowed.
Can you tell me the exact rule?
 
Corey McGlone
Ranch Hand
Posts: 3271
Unicode escape sequences are actually translated prior to compilation. Therefore, code that spells a character as an escape compiles just as if you had typed that character directly.
You can find more about this translation process in the JLS, §3.3 Unicode Escapes
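To make the translation step concrete, here is a minimal sketch. The class and member names are made up for illustration and are not from the original post. The field is declared with escapes for the letters a and b, then read back under its plain spelling, which only works because the escapes are replaced before the compiler looks at identifiers.

```java
class EscapeDemo {
    // Declared with escapes for the letters 'a' and 'b'; after translation
    // the compiler sees an ordinary field named ab.
    static int \u0061\u0062 = 42;

    // Reads the same field using its plain spelling.
    static int readPlainName() {
        return ab;
    }

    public static void main(String[] args) {
        System.out.println(readPlainName());
        // Escapes are translated inside literals too: this compares "a" with "a".
        System.out.println("\u0061".equals("a"));
    }
}
```

If the escape were not translated early, the reference to ab in readPlainName would fail to resolve.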
So, with that in mind, the rules for what you can use to define variable names have not changed. The variable name must start with a "JavaLetter." From the JLS, a JavaLetter is:

A "Java letter" is a character for which the method Character.isJavaIdentifierStart returns true. A "Java letter-or-digit" is a character for which the method Character.isJavaIdentifierPart returns true.
The Java letters include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024). The $ character should be used only in mechanically generated source code or, rarely, to access preexisting names on legacy systems.

Therefore, you can begin a variable name with any Unicode escape sequence that corresponds to a JavaLetter. Notice that \u0000 does not correspond to a JavaLetter and thus cannot be used to begin a variable name.
Throughout the rest of a variable name, however, letters or digits can be used, so every character except the first may also be a Unicode escape sequence for a digit.
Be sure to check out the JLS, §3.8 Identifiers for a more detailed explanation.
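The two methods named in the JLS quote can be called directly to test a character. The sketch below is only an illustration: the class name, helper names, and sample characters are my own choices. Each escape is written inside a char literal so that it denotes a character value rather than a name.

```java
class IdentifierCheck {
    // True if the character may begin a Java identifier (a "Java letter").
    static boolean canStart(char c) {
        return Character.isJavaIdentifierStart(c);
    }

    // True if the character may appear after the first position.
    static boolean canContinue(char c) {
        return Character.isJavaIdentifierPart(c);
    }

    public static void main(String[] args) {
        // Sample characters: a letter, the dollar sign, the underscore,
        // the feminine ordinal indicator, a digit, and the NUL character.
        char[] samples = { 'a', '$', '_', '\u00aa', '7', '\u0000' };
        for (char c : samples) {
            // Print the code point in hex instead of the raw character,
            // since NUL is not printable.
            System.out.println(String.format("\\u%04x", (int) c)
                    + " can start an identifier: " + canStart(c));
        }
        System.out.println("'7' can continue an identifier: " + canContinue('7'));
    }
}
```

Running it should report that the digit and the NUL character cannot start an identifier, while the other four samples can.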
I hope that helps,
Corey
 
Binu K Idicula
Ranch Hand
Posts: 99
Hi there,
Thanks for the reply, but as you said,
---------------
A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024).
-----------------
are the Unicode characters that are valid at the start of an identifier, is that right?
But then how can an identifier start with
\u00aa ?
I also tried this:
System.out.println(Character.isJavaIdentifierStart(\u00aa));
But I got a compilation error saying an int parameter is not accepted by the method.
Expecting your reply.
 
Corey McGlone
Ranch Hand
Posts: 3271
Don't forget that the translation of Unicode escape sequences is done prior to compilation! Therefore, your line of code:

System.out.println(Character.isJavaIdentifierStart(\u00aa));

translates to this:

System.out.println(Character.isJavaIdentifierStart(ª));

Do you have a variable named ª? I didn't think so. However, if you put tick marks (single quotes) around the escape sequence, you'll get the character ª, not the variable ª. Try this:

System.out.println(Character.isJavaIdentifierStart('\u00aa'));

and I think you'll see what you're interested in.
Also, notice that the JavaLetters are not confined to the English letters of the alphabet. You are free to begin variable names with characters outside of that set, but only if the character is considered a JavaLetter, as detailed above.
Corey
[ July 23, 2002: Message edited by: Corey McGlone ]
 
Jon Dornback
Ranch Hand
Posts: 137
While we're talking about Unicode, is there a method that gives the Unicode escape sequence for a character and vice versa? I looked in the String class but didn't see anything.
 
Corey McGlone
Ranch Hand
Posts: 3271
Originally posted by Jon Dornback:
While we're talking about Unicode, is there a method that gives the Unicode escape sequence for a character and vice versa? I looked in the String class but didn't see anything.

Unfortunately, I don't know of a way to do this. Perhaps someone else knows. However, you can see all of the Unicode characters at Unicode.org. Check out the Charts section.
Corey
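As far as I know there is no single library method for this, but a conversion can be sketched with standard calls: String.format with a hex pattern in one direction and Integer.parseInt with radix 16 in the other. The class and method names below are hypothetical, and the sketch only handles characters that fit in a single char, assuming a six-character escape.

```java
class UnicodeEscapes {
    // Formats a char as a six-character Java-style escape with four hex digits.
    static String toEscape(char c) {
        return String.format("\\u%04x", (int) c);
    }

    // Parses a six-character escape, such as the output of toEscape, into a char.
    static char fromEscape(String escape) {
        if (escape.length() != 6 || !escape.startsWith("\\u")) {
            throw new IllegalArgumentException("not a six-character escape: " + escape);
        }
        return (char) Integer.parseInt(escape.substring(2), 16);
    }

    public static void main(String[] args) {
        String escaped = toEscape('a');
        System.out.println(escaped);
        // Round trip: converting back should give the original character.
        System.out.println(fromEscape(escaped) == 'a');
    }
}
```

The doubled backslash in the string literals matters: it produces a literal backslash at run time instead of being consumed by the compiler's own escape translation.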
 
Francisco A Guimaraes
Ranch Hand
Posts: 182
As a matter of fact, $ is not the only currency sign that is valid at the start of an identifier. Any currency sign (for example ¢, £, ¥) can be used.
Francisco
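A quick way to check the currency-sign claim is to compare Character.getType against Character.CURRENCY_SYMBOL and then ask Character.isJavaIdentifierStart. This is only a sketch: the class name, helper name, and choice of sample signs are assumptions of mine, not taken from the post.

```java
class CurrencyStart {
    // True if the character is in Unicode's currency-symbol category.
    static boolean isCurrency(char c) {
        return Character.getType(c) == Character.CURRENCY_SYMBOL;
    }

    public static void main(String[] args) {
        // Dollar, cent, pound, yen and euro signs, written as escapes so the
        // file compiles the same way regardless of source encoding.
        char[] signs = { '$', '\u00a2', '\u00a3', '\u00a5', '\u20ac' };
        for (char c : signs) {
            System.out.println(String.format("\\u%04x", (int) c)
                    + " currency symbol: " + isCurrency(c)
                    + ", can start an identifier: " + Character.isJavaIdentifierStart(c));
        }
    }
}
```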
 
Chung Huang
Ranch Hand
Posts: 56
The error was on \u0000, right? I thought \u0000 was the null character. A char field declared as a class member and never initialized gets \u0000 as its default value. I can't remember exactly, but doesn't it mean null, or something else that means "nothing"? If so, it can't be used as an identifier.
 
Binu K Idicula
Ranch Hand
Posts: 99
Hi,
Thanks for the detailed reply. I now understand the idea of Java letters and Unicode escape sequences.
Thank you
 