Tokenizing with regex pattern. Little confused!

 
Ranch Hand
Posts: 65
I'm using a regex pattern to tokenize a String.
The code runs fine, but I'm curious about the output.
Here's the code:
My code prints each token between > and < so that whitespace is visible.
Here is my command line invocation, where args[0] is the regex pattern to be used and args[1] is the source String:
java Test2 "\d*" "cY 39r k"
The output was:
Token: ><
Token: >c<
Token: >Y<
Token: > <
Token: ><
Token: >r<
Token: > <
Token: >k<

Am I right in saying that cell 0 holds a 'c', which is a delimiter because it is not a digit, so an empty String >< is printed? Cell 1 holds 'Y', which is a delimiter because it is not a digit, so >c< is printed. Cell 2 holds a whitespace character, which is not a digit either, so it also counts as a delimiter. But why isn't >cY< printed? Instead it prints a whitespace > <, which is the delimiter. I would have thought >cY< would be printed.
I read the Java tutorial on searching with regex, and if this were a search I can understand that (off the top of my head) the output would be:
"" @ start index 0, end index 0
"" @ start index 1, end index 1
"" @ start index 2, end index 2
"39" @ start index 3, end index 5
"" @ start index 5, end index 5
"" @ start index 6, end index 6
"" @ start index 7, end index 7
"" @ start index 8, end index 8
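
The match list above can be checked with a small sketch that runs the pattern as a search. The class name DelimiterMatches and the method describeMatches are made up for this illustration and are not from the original program; the sketch only assumes the standard java.util.regex Pattern and Matcher classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the poster's Test2 class.
class DelimiterMatches {

    // Returns one line per match that find() reports, in order.
    static List<String> describeMatches(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            lines.add("\"" + m.group() + "\" @ start " + m.start() + " end " + m.end());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same pattern and source String as the command line in the post.
        for (String line : describeMatches("\\d*", "cY 39r k")) {
            System.out.println(line);
        }
    }
}
```

Run this way, the pattern should report eight matches: seven empty ones and "39" from index 3 to 5, which agrees with the list worked out by hand.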

I just don't understand what happens when the above regex is used as a delimiter for tokenizing.
Please help!
Thank you
[ June 24, 2008: Message edited by: Keith Nagle ]
 
author
Posts: 23919
Your regex pattern for the delimiter is zero or more digits. This means that an empty string (zero digits) is a valid delimiter.
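
A minimal sketch of that point, using only standard library calls. The class and method names are invented for the example, and the \d+ variant is an assumption about what may have been intended (splitting on runs of one or more digits); it is not something the original program used.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class EmptyDelimiterCheck {

    // True if the whole empty string is a valid match for the regex.
    static boolean matchesEmpty(String regex) {
        return Pattern.matches(regex, "");
    }

    // Splits on one or more digits, so an empty string can never be a delimiter.
    static String[] splitOnDigitRuns(String input) {
        return input.split("\\d+");
    }

    public static void main(String[] args) {
        System.out.println(matchesEmpty("\\d*")); // true: zero digits is a match
        System.out.println(matchesEmpty("\\d+")); // false: at least one digit is required
        System.out.println(Arrays.toString(splitOnDigitRuns("cY 39r k"))); // [cY , r k]
    }
}
```

With \d+ the only delimiter in "cY 39r k" is "39", so the tokens are "cY " and "r k".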

Originally posted by Keith Nagle:
Am I right in saying that cell 0 holds a 'c', which is a delimiter because it is not a digit, so an empty String >< is printed? Cell 1 holds 'Y', which is a delimiter because it is not a digit, so >c< is printed. Cell 2 holds a whitespace character, which is not a digit either, so it also counts as a delimiter. But why isn't >cY< printed? Instead it prints a whitespace > <, which is the delimiter. I would have thought >cY< would be printed.

Basically, you have an empty string delimiter before the first character, which is why the first value is an empty string. You have an empty string delimiter between the first and second character, which is why the second value is a "c" -- the value between the first and second delimiters. You have an empty string delimiter between the second and third character, which is why the third value is a "Y" -- the value between the second and third delimiters.

The values are between the delimiters -- they are not independent of each other.
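
The idea that each value is the text between two consecutive delimiters can be traced with a hand-rolled sketch. This is not how the original program was written (its code is not shown here), and it is only a simplified imitation of what a split does: GapTokenizer and tokensBetweenMatches are hypothetical names, and dropping trailing empty tokens is an assumption made to mirror the default behaviour of String.split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class GapTokenizer {

    // Collects the text lying between consecutive delimiter matches.
    static List<String> tokensBetweenMatches(String regex, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int previousEnd = 0; // where the previous delimiter ended
        while (m.find()) {
            // The token is whatever sits between the last delimiter and this one.
            tokens.add(input.substring(previousEnd, m.start()));
            previousEnd = m.end();
        }
        tokens.add(input.substring(previousEnd)); // text after the last delimiter
        // Drop trailing empty tokens, mirroring the default split behaviour.
        while (!tokens.isEmpty() && tokens.get(tokens.size() - 1).isEmpty()) {
            tokens.remove(tokens.size() - 1);
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokensBetweenMatches("\\d*", "cY 39r k")) {
            System.out.println("Token: >" + token + "<");
        }
    }
}
```

For "cY 39r k" this yields "", "c", "Y", " ", "", "r", " ", "k", the same eight tokens as the output in the question. The empty token in the middle comes from the "39" delimiter ending at index 5 and an empty delimiter starting at index 5, with nothing between them. On Java 8 and later, String.split no longer produces a leading empty string for a zero-width match at the very start of the input, so the same split there would give this list without the first ><.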

Henry
[ June 24, 2008: Message edited by: Henry Wong ]
 
Bartender
Posts: 5167
Keith, it's rather rude of you not to tell us here that this question has already been answered on the Sun Java forum 16 hours ago.

Confused about Tokenizing with Regex
 
Marshal
Posts: 75669

Originally posted by Darryl Burke:
This question has already been answered on the Sun Java forum 16 hours ago.

Read this FAQ, please.
 
Darryl Burke
Bartender
Posts: 5167
Umm, till I clicked the link I thought that was directed at me :roll:
 
Campbell Ritchie
Marshal
Posts: 75669

Originally posted by Darryl Burke:
Umm, till I clicked the link I thought that was directed at me :roll:

Sorry.
 