JavaRanch » Java Forums » Java » Beginning Java
how compiler determines that String literal's length ??

Bharat Makwana
Ranch Hand

Joined: May 21, 2007
Posts: 107
Hello Everyone,

Can somebody please tell me how the compiler determines a String literal's length?

String firstName = "Mac";

I have searched the String class, but could not find anything useful related to it.

(I have read the article by Corey McGlone at http://www.javaranch.com/journal/200409/Journal200409.jsp#a1)


ॐ सर्वे जना: सुखिनो भवन्तु , तथास्तु |
'May the whole world be happy, so be it'

SCJP1.5, SCWCD1.5
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14688
    

firstName.length() ?


[My Blog]
All roads lead to JavaRanch
Bharat Makwana
Ranch Hand

Joined: May 21, 2007
Posts: 107
Originally posted by Satou kurinosuke:
firstName.length() ?


No man, I am not talking about that method. I mean: when the compiler finds a String literal, how does it process it? How does it determine its length? What logic does it use to process a String literal? How does it create a String object from "Mac"? I think it must have some way of knowing where Mac ends. I want to know that logic.

If I asked you to determine the length of firstName without using any String class methods, how would you proceed?

I hope you got my question.
David McCombs
Ranch Hand

Joined: Oct 17, 2006
Posts: 212
A String literal is a String.

http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.10.5

If you have trouble finding the answer to a Java question, always try the Java Language Specification.
[ June 22, 2007: Message edited by: David McCombs ]

"Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration."- Stan Kelly-Bootle
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14688
    

Sorry Bharat, I suspected that was not what you were asking for.
Prosenjit Banerjee
Ranch Hand

Joined: Dec 18, 2002
Posts: 102
The Java compiler, like any compiler, scans each character in the source file. When it encounters a double quote character, it flags the start of a string; when it encounters another (unescaped) double quote character, it has reached the end of the string.
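
A minimal sketch of that scanning idea, assuming a hand-rolled scanner (this is not javac's real lexer, and the class and method names are made up for illustration). It collects characters between the opening and closing quote, so the length falls out of the scan itself. Only the escapes \" and \\ are handled here.

```java
// Sketch only: a hand-rolled scan for one string literal, not javac's real lexer.
class LiteralScanner {

    /**
     * Returns the contents of the first string literal found in source,
     * or null if there is no complete literal.
     * Only the escapes \" and \\ are handled in this sketch.
     */
    static String firstLiteral(String source) {
        int i = source.indexOf('"');              // opening quote starts the literal
        if (i < 0) {
            return null;
        }
        StringBuilder contents = new StringBuilder();
        for (i = i + 1; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '\\' && i + 1 < source.length()) {
                contents.append(source.charAt(++i)); // keep the escaped character
            } else if (c == '"') {
                return contents.toString();       // closing quote ends the literal
            } else {
                contents.append(c);
            }
        }
        return null;                              // ran off the end: unterminated literal
    }

    public static void main(String[] args) {
        String literal = firstLiteral("String firstName = \"Mac\";");
        System.out.println(literal + " has length " + literal.length());
    }
}
```

Run on the line from the first post, this prints "Mac has length 3".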
Carefully read and run the following code for an example:


It produces the following output:


Always say the TRUTH only
Bharat Makwana
Ranch Hand

Joined: May 21, 2007
Posts: 107
Thanks Prosenjit,

When I was studying Theory of Automata, regular expressions, and compiler writing, I wondered why they were teaching all these subjects when they could have taught some more languages instead. But they were right.
Devesh H Rao
Ranch Hand

Joined: Feb 09, 2002
Posts: 687

Originally posted by Bharat Makwana:



If I asked you to determine the length of firstName without using any String class methods, how would you proceed?

I hope you got my question.


Store it in a char[]; the length of the array is the length you need.

Note: the String class stores its characters in a char[] as well.
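
A small sketch of that suggestion (the class and method names are made up for illustration). Note that it still calls one String method, toCharArray(), to get the array in the first place.

```java
// Sketch of the char[] suggestion; names here are illustrative only.
class CharArrayLength {

    static int lengthViaArray(String s) {
        char[] chars = s.toCharArray(); // copy of the string's characters
        return chars.length;            // the array's length field
    }

    public static void main(String[] args) {
        String firstName = "Mac";
        System.out.println(lengthViaArray(firstName)); // prints 3
    }
}
```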
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 11406
    

Originally posted by Devesh H Rao:

Store it in a char[]; the length of the array is the length you need.

Note: the String class stores its characters in a char[] as well.


Yes, but the original post was asking how the compiler knows what size char[] to create to begin with.


There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors
Bharat Makwana
Ranch Hand

Joined: May 21, 2007
Posts: 107
Fred is right !!! Thank God.

That's my question. I know I can use the toCharArray() method of the String class to get a char array and then read its length field.

But I want to know how the compiler evaluates the following statement:

String firstName = "Mac"; ------> Line #1

Maybe something like this (assuming there is no syntax error):
(1) The parser reads the first token, "String", from Line #1 and determines whether it is an object type or a primitive (I do not know how). If it is an object type, it determines which type (here, String).
(2) The next token is the reference name (firstName). Then it reads "=", and then it reads " (a double quote), so it assumes a String literal follows (there is no "new").
(3) Then it performs some operations to find the literal's content and length, and stores it in the String literal pool (I am confused about this step).
[ June 22, 2007: Message edited by: Bharat Makwana ]
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Ignoring all the other things a new line of code could be... When the compiler finds "String" it checks to see if the token is a primitive type. There are only a few so that's an easy check against a hard coded list. Then it checks to see if it is a class that has been imported. java.lang is always imported so String is found. Then it looks in the classpath to find the class and learn more about it.

The token "Mac" gets special handling because it's in quotes. I'm sure the compiler has some internal representation for this, not the normal String class that we always use. It has to track things like whether this sequence of characters is in the literal pool yet and where. I'd expect these internals to change over time as the compiler team comes up with smarter algorithms, clever refactorings and even language changes. That's one reason I'm happy to not know much about them. Encapsulation in action.

I'm definitely guessing at the compiler from the outside. Compilers and interpreters are fascinating science, but way beyond what I need to know. A quick Google shows compiler construction courses in CS300 and CS500 levels. Maybe one of those would be fun for you?


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 11406
    

Remember that the compiler is really just a program. If I recall correctly, the javac compiler is written in Java.

So, it's very possible that when the compiler reads the file, it's just using something like a StringTokenizer to break each line into the various tokens, puts each into a String array, and uses the length of each element.
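
Purely as an illustration of that guess (not a claim about how javac works, and the class name is made up), here is what splitting the line on whitespace with java.util.StringTokenizer gives. The last token still carries its quotes and the semicolon, so a compiler has to do more than split on spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustration of the StringTokenizer guess; not how javac is implemented.
class TokenizerGuess {

    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line); // default delimiters: whitespace
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : split("String firstName = \"Mac\";")) {
            System.out.println(token + " (length " + token.length() + ")");
        }
    }
}
```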

Whatever language the compiler is written in has some built-in tools for determining the length of a string.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187
    

Some of what you missed in compiler class...

Most compilers use multiple "stages": a lexer finds the individual tokens, a parser decides whether the stream of tokens forms valid syntax, and a code generator emits the machine code. Modern languages are designed to make this separation possible. Note how virtually all modern languages ignore all whitespace; this makes the separation between the lexer and parser nice and clean.

For a line like

String s = "Mac";

the lexer would return a list of tokens like (conceptually)

{the symbol 'String' at offset 0}
{the symbol 's' at offset 7}
{the operator '=' ...}
{the String literal "Mac" ...}
{a semicolon}

Each token is an object in the language in which the compiler is written; it can hold all kinds of useful information. Those "offset" values can be used later if there are any error messages to report.
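
To make the token list concrete, here is a toy lexer, assuming a tiny language with only identifiers, '=', ';' and string literals without escapes. TinyLexer, Token and Kind are made-up names for this sketch, not anything from javac.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch; TinyLexer, Token and Kind are made up for illustration.
class TinyLexer {

    enum Kind { SYMBOL, OPERATOR, STRING_LITERAL, SEMICOLON }

    static final class Token {
        final Kind kind;
        final String text;
        final int offset; // position in the line, handy for error messages

        Token(Kind kind, String text, int offset) {
            this.kind = kind;
            this.text = text;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return "{" + kind + " '" + text + "' at offset " + offset + "}";
        }
    }

    static List<Token> lex(String line) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (Character.isJavaIdentifierStart(c)) {
                int start = i;
                while (i < line.length() && Character.isJavaIdentifierPart(line.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(Kind.SYMBOL, line.substring(start, i), start));
            } else if (c == '"') {
                int start = i;
                int end = line.indexOf('"', start + 1); // no escape handling in this sketch
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated string literal at offset " + start);
                }
                tokens.add(new Token(Kind.STRING_LITERAL, line.substring(start + 1, end), start));
                i = end + 1;
            } else if (c == '=') {
                tokens.add(new Token(Kind.OPERATOR, "=", i++));
            } else if (c == ';') {
                tokens.add(new Token(Kind.SEMICOLON, ";", i++));
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at offset " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : lex("String s = \"Mac\";")) {
            System.out.println(token);
        }
    }
}
```

The STRING_LITERAL token holds the text Mac, so by the time the lexer is done the literal's contents (and therefore its length) are already known.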

Next the parser would look at the first token and say "Hmm, a symbol that's not a keyword. This could be an assignment statement, a variable declaration, or a method call". Then it looks at the next token and says "another symbol. OK, this has to be a variable declaration for a variable named 's'. It could be a declaration with or without an initializer." Then it looks at the third token and says "ah, an equals sign: this is a variable declaration with an initializer." Then it sees the String literal and says "OK, we're assigning this literal value to the new variable." When it sees the semicolon, it knows the statement is over, and this is syntactically valid code.
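
The same token-by-token reasoning can be sketched as a tiny recognizer. This assumes the lexer has already reduced the line to a list of token kinds, and it only knows the two declaration shapes discussed above; TinyParser and Kind are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the parser's decisions for one statement shape; names are illustrative.
class TinyParser {

    enum Kind { SYMBOL, EQUALS, STRING_LITERAL, SEMICOLON }

    /** Recognizes only: SYMBOL SYMBOL [EQUALS STRING_LITERAL] SEMICOLON */
    static String classify(List<Kind> tokens) {
        // Two symbols in a row: a type name followed by a variable name.
        if (!is(tokens, 0, Kind.SYMBOL) || !is(tokens, 1, Kind.SYMBOL)) {
            return "not a variable declaration";
        }
        // A semicolon right away: declaration without an initializer.
        if (is(tokens, 2, Kind.SEMICOLON) && tokens.size() == 3) {
            return "declaration without initializer";
        }
        // An equals sign, a literal, then the semicolon that ends the statement.
        if (is(tokens, 2, Kind.EQUALS) && is(tokens, 3, Kind.STRING_LITERAL)
                && is(tokens, 4, Kind.SEMICOLON) && tokens.size() == 5) {
            return "declaration with initializer";
        }
        return "syntax error";
    }

    private static boolean is(List<Kind> tokens, int index, Kind expected) {
        return index < tokens.size() && tokens.get(index) == expected;
    }

    public static void main(String[] args) {
        // Token kinds for: String s = "Mac";
        List<Kind> tokens = Arrays.asList(
                Kind.SYMBOL, Kind.SYMBOL, Kind.EQUALS, Kind.STRING_LITERAL, Kind.SEMICOLON);
        System.out.println(classify(tokens)); // prints: declaration with initializer
    }
}
```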

At some later stage, the parser looks at this assignment statement again and asks whether 'String' is a known class, and whether a String literal is a valid initializer for it. If all is well, the statement eventually gets passed to the code generator, which puts the appropriate bytecodes into the class file being generated.
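
As a footnote on where the length ends up: in the class file format, a literal's characters live in a CONSTANT_Utf8 constant pool entry, which is a one-byte tag (1), a two-byte length, and then the bytes in modified UTF-8. The sketch below only mimics that layout with DataOutputStream.writeUTF, which happens to write the same two-byte length plus modified UTF-8; it is not javac's code, and the class name is made up. The length here counts bytes, which matches the character count only for plain ASCII text like "Mac".

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Mimics the layout of a CONSTANT_Utf8 entry; a sketch, not javac's code generator.
class Utf8ConstantSketch {

    static byte[] encode(String literal) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(1);      // tag value for CONSTANT_Utf8
            out.writeUTF(literal); // two-byte length, then modified UTF-8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : encode("Mac")) {
            System.out.printf("%02x ", b);
        }
        System.out.println(); // prints: 01 00 03 4d 61 63
    }
}
```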


[Jess in Action][AskingGoodQuestions]
 