
Tokenising using String.split().

 
O. Ziggy
Ranch Hand
Posts: 430
Android Debian VI Editor
I wrote the following test, which uses "\\d" as the delimiter:



And the output is shown below



I was surprised by the empty tokens. For example, why are there empty tokens at index 0 and 1 but none between 2 and 3, even though there is a number between 2 and 3? And why are there no empty tokens after token 20?

Also, is there a way to access the "current index" value when using the enhanced for loop? (I had to manually declare int i=0 as shown above.)
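For reference, the enhanced for loop does not expose the iteration index, so the usual options are a manually maintained counter (as described above) or a classic indexed loop. A minimal sketch of the latter, assuming a hypothetical input "a1b2c" and a hypothetical helper named label, since the original test code is not shown:

```java
import java.util.ArrayList;
import java.util.List;

class IndexedTokens {
    // Pairs each token with its position using a classic indexed loop.
    // The helper name and output format are illustrative only.
    static List<String> label(String[] tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            out.add(i + "=[" + tokens[i] + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        // "a1b2c" is a hypothetical input, not the string from the original test.
        String[] tokens = "a1b2c".split("\\d");
        for (String line : label(tokens)) {
            System.out.println(line); // 0=[a], 1=[b], 2=[c] on separate lines
        }
    }
}
```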

Thanks
 
Rikesh Desai
Ranch Hand
Posts: 83
I read about a similar problem somewhere and tried to understand the reason for it.
I will do my best to explain it to you.
Seniors, please correct me if I am wrong.

The blank tokens are not there because a delimiter was found at that position.
They are there because no value was found between two delimiters.
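The same reasoning would account for empty tokens at the very start of the array: if the input begins with digits, nothing precedes the first delimiter and nothing sits between the first and second. A small sketch, assuming a hypothetical input "12ab3cd" (the original test string is not shown):

```java
import java.util.Arrays;

class LeadingDelimiters {
    // Splits on a single digit, as in the question.
    static String[] tokens(String input) {
        return input.split("\\d");
    }

    public static void main(String[] args) {
        // Hypothetical input that starts with two digits.
        String[] result = tokens("12ab3cd");
        // Index 0 and 1 are empty strings; then "ab" and "cd".
        System.out.println(result.length);           // 4
        System.out.println(Arrays.toString(result)); // [, , ab, cd]
    }
}
```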

In the case of the first h and j, a digit was found between them. Since a digit is the delimiter, the string was split there and produced the two tokens h and j.

For the second case, "jh34j":
The delimiter 3 was found, so the string was split there, giving the tokens jh and 'blank' ('blank' because nothing lies between 3 and 4).
Moving on, another delimiter, 4, was found. The tokens formed here were 'blank' (already formed above and saved as a token) and j.

So for jh34j, the tokens formed are jh, 'blank' and j.
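The two cases can be checked directly. This is only a sketch: "jh34j" is taken from the explanation above, while "h1j" is an assumed stand-in for the "h and j" case, and the bracket formatting is just there to make the empty token visible.

```java
class AdjacentDelimiters {
    // Splits on a single digit and wraps each token in brackets
    // so that empty tokens show up as [].
    static String show(String input) {
        StringBuilder sb = new StringBuilder();
        for (String token : input.split("\\d")) {
            sb.append('[').append(token).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(show("h1j"));   // [h][j]
        System.out.println(show("jh34j")); // [jh][][j]
    }
}
```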

I hope I was able to explain this well.
 
O. Ziggy
Ranch Hand
Posts: 430
Android Debian VI Editor
Thanks Rikesh. I think you are saying that an empty token is created if two delimiters follow each other without any value between them. I'm still not sure, though, why the same logic was not applied at the end of the string (i.e. dddd33333). I would have expected the tokens for this part of the string to be dddd, empty, empty, empty, empty.

Thanks
 
ankur trapasiya
Ranch Hand
Posts: 160
Java Netbeans IDE Ubuntu
Thanks, O. Ziggy, for posting this question. I have the same doubt: I would also expect it to print, as you said, dddd followed by four empty tokens.
 
Rikesh Desai
Ranch Hand
Posts: 83
Happy that I was able to make my point clear.

For your next question: if you look at the Java documentation for the String.split() method, it mentions that "Trailing empty strings are not included in the array".

Have a look at the overloaded split(String regex, int limit) method in the Javadoc.
That will make things very clear.
It is explained very nicely there, with an example.
 
O. Ziggy
Ranch Hand
Posts: 430
Android Debian VI Editor
OK, I've found it. Trailing empty strings are not included in the array. From the Javadoc:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
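A short sketch of what that means for the dddd33333 case from earlier in the thread (the class and method names are illustrative only). With a limit of zero, which is what the one-argument split uses, the trailing empty strings are dropped; with a negative limit they are kept. Note that five digit delimiters leave five empty tokens after dddd, not four.

```java
import java.util.Arrays;

class TrailingEmpties {
    // Splits on a single digit with an explicit limit argument.
    static String[] splitWithLimit(String input, int limit) {
        return input.split("\\d", limit);
    }

    public static void main(String[] args) {
        String input = "dddd33333";

        // Limit 0: same result as input.split("\\d"); trailing empties removed.
        String[] trimmed = splitWithLimit(input, 0);
        System.out.println(trimmed.length);           // 1
        System.out.println(Arrays.toString(trimmed)); // [dddd]

        // Negative limit: trailing empty strings are kept.
        String[] all = splitWithLimit(input, -1);
        System.out.println(all.length);               // 6: "dddd" plus five empty strings
    }
}
```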


Thanks for the help.

 