
string.split() and tokens

 
Rachel Glenn
Ranch Hand
Posts: 95
I have this example:



What is the result?
A. total: 3
B. total: 4
C. total: 7
D. total: 8
E. Compilation fails
F. An exception is thrown at runtime.


While I understand the concept of tokenizing, I am unsure how it works in this specific example. I even ran it in the debugger and am unclear about the output.

\d means the delimiter is a digit. So how does this example work? Does split() see the first digit (1) and record 'x' as the first token? What does split() do when it then sees the second digit (2)?
 
Greg Charles
Sheriff
Posts: 2985
It's confusing because it's weird to think of digits as delimiters. Imagine the string was "x,,,, y,, z, a" and you split it on the commas. You'd expect to get eight strings returned, four of which would be empty because there are multiple commas in a row with nothing between them.
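To make the comma analogy concrete, here is a minimal sketch. The class name CommaSplitDemo and the helper total are made up for illustration, and the digit string is only a stand-in built by swapping each comma for the digit 1; it is not the code from the original question.

```java
public class CommaSplitDemo {
    // Number of pieces String.split produces for the given regex.
    static int total(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        String commas = "x,,,, y,, z, a";
        // Hypothetical digit variant: every comma replaced by the digit 1.
        String digits = commas.replace(',', '1');

        System.out.println("total: " + total(commas, ","));    // total: 8
        System.out.println("total: " + total(digits, "\\d"));  // total: 8

        // Brackets make the empty strings visible.
        for (String piece : commas.split(",")) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

Seven delimiters produce eight pieces, whether the delimiter is a comma or a digit.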
 
Rachel Glenn
Ranch Hand
Posts: 95
Greg Charles wrote: It's confusing because it's weird to think of digits as delimiters. Imagine the string was "x,,,, y,, z, a" and you split it on the commas. You'd expect to get eight strings returned, four of which would be empty because there are multiple commas in a row with nothing between them.


Yes, it is confusing!

But let me take this a step further.

If the string was "x,y" and I split on commas, I would expect 2 strings to be returned: "x" and "y".

If the string was "x,,y" and I split on commas, this is where I get confused. It sees the first comma and marks "x" as the first token. Does it then consider the "x" and the first "," as consumed? If so, when it sees the second comma there is nothing to the left of it to tokenize, so it returns an empty string? I am confused here.
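The "consumed" idea in this question can be sketched by hand. The following is a simplified approximation, not the JDK's actual implementation: the class ManualSplit and the method tokens are hypothetical names, and unlike String.split this version does not drop trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ManualSplit {
    // Each token is the text between the end of the previous delimiter
    // match and the start of the next one, so it can be empty.
    static List<String> tokens(String input, String regex) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int from = 0; // index just past the last delimiter seen
        while (m.find()) {
            out.add(input.substring(from, m.start()));
            from = m.end();
        }
        out.add(input.substring(from)); // text after the final delimiter
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("x,y", ","));   // [x, y]
        System.out.println(tokens("x,,y", ","));  // [x, , y]
    }
}
```

For "x,,y" the second comma starts exactly where the first one ended, so the substring between them has length zero and the middle token is the empty string.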
 
Greg Charles
Sheriff
Posts: 2985
Yes, if you tell it that one comma is the delimiter, it will take you at your word and return an empty string between two commas in a row. That's a good thing. Let's say you had the data:

"FirstName,Nickname,LastName"
"Ralph,'Macho',Camacho"
"Greg,'T-bone',Charles"
"Rachel,,Glenn"

You'd want your first and last names parsed out correctly even though you don't have a nickname.

If you really want to split the string on one or more commas, you just need to change the regular expression in the split() to string.split(",+"). In that case, the first three strings above get split into three pieces, but the last one only gets split into two.
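The difference between the two patterns can be checked with a short sketch. The class name NicknameSplit and the helper count are invented for illustration; the rows are the four sample lines above, treated as Java string literals.

```java
public class NicknameSplit {
    // Number of pieces the row is split into for the given regex.
    static int count(String row, String regex) {
        return row.split(regex).length;
    }

    public static void main(String[] args) {
        String[] rows = {
            "FirstName,Nickname,LastName",
            "Ralph,'Macho',Camacho",
            "Greg,'T-bone',Charles",
            "Rachel,,Glenn"
        };
        for (String row : rows) {
            // "," keeps the empty nickname; ",+" swallows both commas at once.
            System.out.println(row
                    + " | split(\",\"): " + count(row, ",")
                    + " | split(\",+\"): " + count(row, ",+"));
        }
    }
}
```

Every row yields 3 pieces with "," while "Rachel,,Glenn" yields only 2 with ",+", which would shift the last name into the nickname column.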
 
Rachel Glenn
Ranch Hand
Posts: 95
Greg Charles wrote: Yes, if you tell it that one comma is the delimiter, it will take you at your word and return an empty string between two commas in a row. That's a good thing. Let's say you had the data:

"FirstName,Nickname,LastName"
"Ralph,'Macho',Camacho"
"Greg,'T-bone',Charles"
"Rachel,,Glenn"

You'd want your first and last names parsed out correctly even though you don't have a nickname.

If you really want to split the string on one or more commas, you just need to change the regular expression in the split() to string.split(",+"). In that case, the first three strings above get split into three pieces, but the last one only gets split into two.


Thank you! Makes sense now!
 
Henry Wong
author
Marshal
Posts: 21190
 
Rachel Glenn
Ranch Hand
Posts: 95
Henry Wong wrote: Please QuoteYourSources


My source is the Oracle mock exam.
 