
Tokenising using String.split().

O. Ziggy
Ranch Hand

Joined: Oct 02, 2005
Posts: 430

I wrote the following test, which uses "\\d" as the delimiter:



And the output is shown below



I was surprised by the empty tokens. For example, why are there empty tokens at index 0 and 1 but none between 2 and 3, even though there is a number between 2 and 3? And why are there no empty tokens after token 20?

Also, is there a way to access the "current index" value when using the enhanced for loop? (I had to manually declare int i=0 as shown above.)

Thanks
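On the index question: the enhanced for loop does not expose a position, so the usual choices are a manual counter or a classic indexed for loop. A minimal sketch of the second option follows; the class name, the helper describe and the input "a1b2c" are hypothetical and not taken from the original test.

```java
public class IndexedLoop {
    // Builds one "index: [token]" line per token using a classic for loop,
    // which makes the current index available without a separate counter.
    static String describe(String[] tokens) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            sb.append(i).append(": [").append(tokens[i]).append("]\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = "a1b2c".split("\\d");
        System.out.print(describe(tokens));
    }
}
```

The brackets around each token make empty strings visible in the output.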
Rikesh Desai
Ranch Hand

Joined: Jun 02, 2010
Posts: 83
I had read about a similar problem somewhere and tried to understand the reason for it.
I'll try my best to explain it to you.
Seniors, please correct me if I am wrong.

The blank tokens do not appear simply because a delimiter was found there.
They appear because no value was found between two delimiters.

Take the case of the first h and j: a digit was found between them, and since a digit is the delimiter, the string was split there and gave two tokens, h and j.

For the second case: "jh34j".
The delimiter 3 was found, so the string was split there, which gave the token jh. Nothing was found between 3 and 4, so the next token is 'blank' (an empty string).
The delimiter 4 closes that 'blank' token, and what follows it becomes the token j.

So for jh34j the tokens formed are jh, 'blank' and j.

Hoping I was able to explain this well.
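The "jh34j" case can be checked directly. The sketch below uses a hypothetical class and helper name; the second input, "12ab", is an assumed example (not from the original test) showing that delimiters at the very start of the string also leave empty tokens in front.

```java
import java.util.Arrays;

public class ConsecutiveDelimiters {
    // Splits on every single digit, as in the thread's "\\d" delimiter.
    static String[] splitOnDigits(String input) {
        return input.split("\\d");
    }

    public static void main(String[] args) {
        // '3' and '4' are adjacent delimiters, so one empty string sits between them.
        System.out.println(Arrays.toString(splitOnDigits("jh34j"))); // [jh, , j]

        // Nothing precedes '1', and nothing lies between '1' and '2':
        // two leading empty strings, then "ab".
        System.out.println(Arrays.toString(splitOnDigits("12ab")));  // [, , ab]
    }
}
```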


OCPJP 95%
O. Ziggy
Ranch Hand

Joined: Oct 02, 2005
Posts: 430

Thanks Rikesh. I think you are saying that an empty token is created if two delimiters follow each other without any value between them. I'm still not sure, though, why the same logic was not applied at the end of the string (i.e. dddd33333). I would have expected the tokens for this part of the string to be dddd,null,null,null,null.

Thanks
ankur trapasiya
Ranch Hand

Joined: Sep 24, 2010
Posts: 160

Thanks, O. Ziggy, for posting this question. I actually have the same doubt: I thought it would print, as you said, dddd null null null null.


OCPJP(83%)
Rikesh Desai
Ranch Hand

Joined: Jun 02, 2010
Posts: 83
Happy that I was able to make my point clear.

For your next question: if you look at the Java documentation for the String.split() method, it mentions that "Trailing empty strings are not included in the array".

Have a look at the overloaded split(String regex, int limit) method in the Javadoc.
That will make things very clear.
It is explained very nicely there with an example.
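To see the effect of the limit argument on the "dddd33333" fragment, here is a small sketch with hypothetical class and method names. A negative limit keeps the trailing empty strings; note that five delimiters then yield five empty strings after dddd (four between the digits and one after the last digit), and they are empty strings, not nulls.

```java
public class SplitLimitDemo {
    // One-argument form: trailing empty strings are discarded.
    static String[] dropTrailing(String s) {
        return s.split("\\d");
    }

    // Negative limit: the pattern is applied as often as possible
    // and trailing empty strings are kept.
    static String[] keepTrailing(String s) {
        return s.split("\\d", -1);
    }

    public static void main(String[] args) {
        System.out.println(dropTrailing("dddd33333").length); // 1
        System.out.println(keepTrailing("dddd33333").length); // 6
    }
}
```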
O. Ziggy
Ranch Hand

Joined: Oct 02, 2005
Posts: 430

OK, I've found it. Trailing empty strings are not included in the array.

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
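The equivalence described in that passage can be verified with a short sketch (class and helper names are hypothetical). The "123" input is an assumed extreme case: every token is empty, so with a limit of zero all of them count as trailing and the result is an empty array.

```java
import java.util.Arrays;

public class DefaultLimitDemo {
    // True when split(regex) and split(regex, 0) return the same tokens.
    static boolean sameAsLimitZero(String s, String regex) {
        return Arrays.equals(s.split(regex), s.split(regex, 0));
    }

    public static void main(String[] args) {
        System.out.println(sameAsLimitZero("jh34j", "\\d"));     // true
        System.out.println(sameAsLimitZero("dddd33333", "\\d")); // true

        // Only delimiters: every token is an empty string, so all are dropped.
        System.out.println("123".split("\\d").length);           // 0
    }
}
```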


Thanks for the help.

 
 