string.split() and tokens

Rachel Glenn
Ranch Hand

Joined: Oct 24, 2012
Posts: 95
I have this example:



What is the result?
A. total: 3
B. total: 4
C. total: 7
D. total: 8
E. Compilation fails
F. An exception is thrown at runtime.


While I understand the concept of tokenizing, I am unsure how it works in this specific example. I even ran it in the debugger and am still unclear about the output.

\d means the delimiter is a digit. So how does this example work? Does split() see the first digit (1) and record 'x' as the first token? What does split() do when it then sees the second digit (2)?
Greg Charles
Sheriff

Joined: Oct 01, 2001
Posts: 2861

It's confusing because it's weird to think of digits as delimiters. Imagine the string was "x,,,, y,, z, a" and you split it on the commas. You'd expect to get eight strings returned, many of which would be empty because there are multiple commas in a row with nothing between them.
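
A minimal Java sketch of the comma example above, for anyone who wants to run it. The class name SplitCountDemo, the helper countTokens, and the digit-delimited string are all made up for illustration. In particular, the digit string is not the one from the exam question, just a hypothetical analogue with digits where the commas were.

```java
public class SplitCountDemo {
    // Counts the strings returned when s is split on the given regex.
    static int countTokens(String s, String regex) {
        return s.split(regex).length;
    }

    public static void main(String[] args) {
        // The comma string from the post above: 7 commas, so 8 pieces.
        System.out.println("total: " + countTokens("x,,,, y,, z, a", ","));

        // Hypothetical analogue with digits as delimiters (not the exam's string).
        System.out.println("total: " + countTokens("x1234 y56 z7 a", "\\d"));
    }
}
```

Both calls should report eight strings, because every single delimiter character ends one token and starts the next, even when nothing sits between two delimiters.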
Rachel Glenn
Ranch Hand

Joined: Oct 24, 2012
Posts: 95
Yes, it is confusing!

But let me take this a step further.

If the string was "x,y" and I split on commas, I would expect 2 strings to be returned: "x" and "y".

If the string was "x,,y" and I split on commas, this is where I get confused. It sees the first comma and marks "x" as the first token. Does it then consider the "x" and the first "," as consumed? So when it sees the second comma, there is nothing to the left of it to tokenize, and it returns an empty string?
Greg Charles
Sheriff

Joined: Oct 01, 2001
Posts: 2861

Yes, if you tell it that one comma is the delimiter, it will take you at your word and return an empty string between two commas in a row. That's a good thing. Let's say you had the data:

"FirstName,Nickname,LastName"
"Ralph,'Macho',Camacho"
"Greg,'T-bone',Charles"
"Rachel,,Glenn"

You'd want your first and last names parsed out correctly even though you don't have a nickname.
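
As a rough sketch of why the empty string is useful: splitting each row on a single comma keeps the columns lined up. The class and method names here are invented for the example.

```java
public class NicknameRowDemo {
    // Splits one row on single commas; an empty field between commas stays as "".
    static String[] fields(String row) {
        return row.split(",");
    }

    public static void main(String[] args) {
        String[] rachel = fields("Rachel,,Glenn");
        // Three fields: first name, empty nickname, last name.
        System.out.println(rachel.length);
        System.out.println("first=" + rachel[0] + " nick=[" + rachel[1] + "] last=" + rachel[2]);

        // The shorter case asked about earlier: "x", "", "y".
        System.out.println(fields("x,,y").length);
    }
}
```

Note that the one-argument split() discards trailing empty strings, so a row that ends in commas comes back shorter; split(",", -1) keeps them.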

If you really want to split the string on one or more commas, you just need to change the regular expression in the split() to string.split(",+"). In that case, the first three strings above get split into three pieces, but the last one only gets split into two.
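
A quick sketch of the ",+" variant on the same four rows. The class name and the pieces() helper are hypothetical, not part of any library.

```java
public class OneOrMoreCommasDemo {
    // Splits on runs of one or more commas, so ",," counts as a single delimiter.
    static int pieces(String row) {
        return row.split(",+").length;
    }

    public static void main(String[] args) {
        String[] rows = {
            "FirstName,Nickname,LastName",
            "Ralph,'Macho',Camacho",
            "Greg,'T-bone',Charles",
            "Rachel,,Glenn"
        };
        for (String row : rows) {
            System.out.println(pieces(row) + " pieces: " + row);
        }
    }
}
```

The first three rows should come back as three pieces and the last as two, which is exactly why ",+" is the wrong choice when a blank column carries meaning.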
Rachel Glenn
Ranch Hand

Joined: Oct 24, 2012
Posts: 95
Thank you! That makes sense now!
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18997

Please QuoteYourSources


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Rachel Glenn
Ranch Hand

Joined: Oct 24, 2012
Posts: 95
My source is the Oracle mock exam.
 