string.split() and tokens

 
Rachel Glenn
Ranch Hand
Posts: 95
I have this example:



What is the result?
A. total: 3
B. total: 4
C. total: 7
D. total: 8
E. Compilation fails
F. An exception is thrown at runtime.


While I understand the concept of tokenizing, I am unsure how it works in this specific example. I even ran it in the debugger and am unclear about the output.

\d means the delimiter is a digit. So how does this example work? Does split() see the first digit (1) and record that the first token is 'x'? What does split() do when it then sees the second digit (2)?
 
Greg Charles
Sheriff
Posts: 3063
It's confusing because it's weird to think of digits as delimiters. Imagine the string was "x,,,, y,, z, a" and you split it on the commas. You'd expect to get eight strings returned, many of which would be empty because there are multiple commas in a row with nothing between them.
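To make the count concrete, here is a minimal sketch. The class and method names are made up for illustration, and the digit string is a hypothetical input (the comma string above with each comma swapped for a digit), not necessarily the one from the exam question.

```java
public class SplitDemo {
    // Thin wrapper around String.split so the two cases read the same way.
    static String[] splitOn(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // Seven commas, so eight pieces; pieces between adjacent commas are empty.
        String[] byComma = splitOn("x,,,, y,, z, a", ",");
        System.out.println("total: " + byComma.length); // total: 8

        // Hypothetical digit version: every single digit is one delimiter.
        String[] byDigit = splitOn("x1234 y56 z7 a", "\\d");
        System.out.println("total: " + byDigit.length); // total: 8

        // Brackets make the empty strings visible.
        for (String piece : byDigit) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

Each delimiter ends one piece and starts the next, so two delimiters side by side have an empty string between them.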
 
Rachel Glenn
Ranch Hand
Posts: 95

Yes, it is confusing!

But let me take this a step further.

If the string was "x,y" and I split on commas, I would expect 2 strings to be returned: "x" and "y".

If the string was "x,,y" and I split on commas, this is where I get confused. It sees the first comma and marks "x" as the first token. Does it then consider the "x" and the first "," as 'consumed'? So when it sees the second comma, there is nothing to the left of it to tokenize, and it returns an empty string? I am confused here.
 
Greg Charles
Sheriff
Posts: 3063
Yes, if you tell it that one comma is the delimiter, it will take you at your word and return an empty string for two commas in a row. That's a good thing. Let's say you had the data:

"FirstName,Nickname,LastName"
"Ralph,'Macho',Camacho"
"Greg,'T-bone',Charles"
"Rachel,,Glenn"

You'd want your first and last names parsed out correctly even though you don't have a nickname.

If you really want to split the string on one or more commas, you just need to change the regular expression in the split() to string.split(",+"). In that case, the first three strings above get split into three pieces, but the last one only gets split into two.
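Here is a small sketch of the difference, using the four rows above. The class and method names are invented for this example.

```java
public class NicknameSplit {
    // Every single comma is a delimiter, so an empty field is kept.
    static String[] splitEachComma(String row) {
        return row.split(",");
    }

    // A run of one or more commas counts as one delimiter.
    static String[] splitCommaRuns(String row) {
        return row.split(",+");
    }

    public static void main(String[] args) {
        String[] rows = {
            "FirstName,Nickname,LastName",
            "Ralph,'Macho',Camacho",
            "Greg,'T-bone',Charles",
            "Rachel,,Glenn"
        };
        for (String row : rows) {
            System.out.println(row
                + " -> \",\": " + splitEachComma(row).length
                + " pieces, \",+\": " + splitCommaRuns(row).length + " pieces");
        }
    }
}
```

With "," the last row still has three fields and the nickname is an empty string, so the last name stays in position 2. With ",+" the last row collapses to two pieces and the last name slides into the nickname's slot.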
 
Rachel Glenn
Ranch Hand
Posts: 95

Thank you! Makes sense now!
 
Henry Wong
author
Posts: 23951
Please QuoteYourSources
 
Rachel Glenn
Ranch Hand
Posts: 95

Henry Wong wrote:Please QuoteYourSources



My source is the Oracle mock exam.
 