Information on string object internals

Vershana Amarula
Greenhorn

Joined: Jul 16, 2007
Posts: 4
I don't have a problem, but I am very curious to understand this aspect of Java internals that I encountered recently. It may be a newb question, but I wasn't able to find an appropriate Google keyword to get an answer.

In the debugger I observed that the string variable "name" is equal to "/css/cssimages/circlexx.png".

When executed, the debugger shows the integer variable "test" to be equal to zero, meaning that the substring, as expected, starts at the first character of the string variable "name".

IntelliJ doesn't explicitly indicate that line 7 fires, but line number 14 executes as expected.


When I look at the string structure in the debugger (IntelliJ and Eclipse) I see, roughly speaking, the following for string variable "name":


Since test is zero, that says that the target string was found at the beginning of "name". "Count" is 27, which corresponds to the length of the string "name" as displayed above in the first line of the string object shown.

While debugging, I opened the 58-element char array within the string object.

My expectation was to see my search string followed by some empty or null trailing characters (zero from "indexOf", after all, means the substring starts at the beginning of the string).

I was surprised to find that the full 58 bytes are occupied by a long string, to the effect, apparently a uri, "/XXX/yyy/abc/css/cssimages/circlexx.png" such that the total string, ending in "png" is 58 bytes long.

It was then that I noticed the object field "offset = 31". Sure enough, the substring /css/cssimages/... begins exactly at index 31 of the array, and 31 + 27 = 58 accounts for the whole array. It would seem that Java treats the contents of the string as beginning at index 31.

Clearly, the debugger is content that, for comparison purposes, the actual string value begins at index 31, and "indexOf" seems to support that when it reports zero and the code executes.

But why is there extra "stuff" in the string that needs to be ignored? I figured it was something to do with immutability and performance, but now that I'm aware of it, I'm really curious why it happens.

Thanks!

[ January 11, 2008: Message edited by: Vershana Amarula ]
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8290

Originally posted by Vershana Amarula:

I was surprised to find that the full 58 bytes are occupied by a long string, to the effect, apparently a uri, "/XXX/yyy/abc/css/cssimages/circlexx.png" such that the total string, ending in "png" is 58 bytes long.


If you look in your JDK directory, you should find a src.zip file. Unzip it and look at the code for String yourself.
When you invoke substring on a String instance, the new String references the original String's char array and uses an offset and a count within that array to identify the substring.
This saves memory compared with copying the same character data again and again.
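
To make the sharing concrete, here is a minimal sketch of the idea. It is a simplified model and not the JDK's String source: the class name SharedString, its methods, and the 31-character prefix are made up for illustration, and only the field names value, offset, and count mirror what the debugger showed. It assumes a JDK of that era, where substring shares the array; later JDK releases copy the characters instead.

```java
public class SharedSubstringSketch {

    /** Simplified stand-in for the old String layout: a shared char[] plus offset and count. */
    static final class SharedString {
        final char[] value; // backing array, possibly shared with other instances
        final int offset;   // index in value where this string's text starts
        final int count;    // number of chars that belong to this string

        SharedString(String s) {
            this(s.toCharArray(), 0, s.length());
        }

        SharedString(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        /** Returns a view onto the same array; no characters are copied. */
        SharedString substring(int beginIndex) {
            if (beginIndex < 0 || beginIndex > count) {
                throw new StringIndexOutOfBoundsException(beginIndex);
            }
            return new SharedString(value, offset + beginIndex, count - beginIndex);
        }

        /** Index of target relative to this string's own start, or -1 if absent. */
        int indexOf(String target) {
            for (int i = 0; i + target.length() <= count; i++) {
                if (new String(value, offset + i, target.length()).equals(target)) {
                    return i;
                }
            }
            return -1;
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    public static void main(String[] args) {
        // Made-up 31-character prefix standing in for the real URI prefix.
        String prefix = "/hypothetical/prefix/of/31/char";
        SharedString full = new SharedString(prefix + "/css/cssimages/circlexx.png");
        SharedString name = full.substring(prefix.length());

        System.out.println(name);                                  // /css/cssimages/circlexx.png
        System.out.println("array length = " + name.value.length); // 58
        System.out.println("offset = " + name.offset);             // 31
        System.out.println("count = " + name.count);               // 27
        System.out.println("indexOf = " + name.indexOf("/css"));   // 0
    }
}
```

In this model the "extra stuff" in front of the substring is simply the rest of the parent string's array, and every operation skips it by adding offset before touching value.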


"blabbing like a narcissistic fool with a superiority complex" ~ N.A.
Bill Shirley
Ranch Hand

Joined: Nov 08, 2007
Posts: 457
As said above, a substring reuses the char buffer that the original string already holds and simply points into it at a later offset.

ON A DIFFERENT NOTE:
The conditional (short-circuit) OR operator is ||, and you used |, which is the bitwise OR for integers and a non-short-circuit logical OR for booleans.
You should use the one you logically intend.
They happen to be the same in this instance, but that's happenstance.


Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24081
    

Originally posted by Bill Shirley:

They happen to be the same in this instance, but that's happenstance.


They give the same result, but the || version would stop doing comparisons after the first "hit", and so do a lot less computation. In a loop, this could make a lot of difference.
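
A small sketch of that difference, under the assumption that the original condition was a chain of indexOf comparisons. The original code is not shown in the thread, so the helper startsAt0 and the targets "/css", "/js", and "/img" are hypothetical. The counter records how many comparisons actually run with each operator.

```java
public class ShortCircuitSketch {
    static int calls = 0; // how many comparisons actually ran in the last expression

    // Hypothetical check standing in for the comparisons in the original code.
    static boolean startsAt0(String name, String target) {
        calls++;
        return name.indexOf(target) == 0;
    }

    // '|' on booleans evaluates every operand, even after one is already true.
    static boolean eagerOr(String name) {
        calls = 0;
        return startsAt0(name, "/css") | startsAt0(name, "/js") | startsAt0(name, "/img");
    }

    // '||' stops at the first operand that is true.
    static boolean shortCircuitOr(String name) {
        calls = 0;
        return startsAt0(name, "/css") || startsAt0(name, "/js") || startsAt0(name, "/img");
    }

    public static void main(String[] args) {
        String name = "/css/cssimages/circlexx.png";
        System.out.println(eagerOr(name) + " after " + calls + " comparison(s)");        // true after 3 comparison(s)
        System.out.println(shortCircuitOr(name) + " after " + calls + " comparison(s)"); // true after 1 comparison(s)
    }
}
```

Both versions return the same boolean here; only the amount of work differs, which matters most when the operands are expensive or have side effects.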

