
Regex Find, Start and Group

Joshua Smith
Ranch Hand

Joined: Aug 22, 2005
Posts: 193
All-

I was working through some of the mock questions provided by Kathy Sierra and Bert Bates and had a question about regular expressions. Maybe someone here can clear things up for me.

It's question #7 from the mock questions that they posted to the list.

I've modified the code slightly so it's clearer which output comes from which method. My version of their code is as follows:



The output is:

0:[]
1:[]
2:[34]
4:[]
5:[]
6:[]
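
The code listing did not survive in this copy of the thread. As a rough stand-in, here is a minimal sketch that reproduces the output above, assuming the pattern is "\\d*" and the input is ab34ef (both are named in the thread). The class name RegexFindDemo and the trace helper are made up for illustration and are not the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reconstruction, not the code from the original post.
public class RegexFindDemo {
    // Builds one "start:[group]" line for every successful find().
    static String trace(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.start()).append(":[").append(m.group()).append("]\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed inputs: pattern \d* and the string ab34ef.
        System.out.print(trace("\\d*", "ab34ef"));
    }
}
```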


As I understand what's happening, the find() method is walking the String ab34ef from left to right, looking for matches. If it finds one, then it's available via the group() method. If it doesn't find one, then you get a zero-length String from the group method.
For position 0 it finds "a" (which doesn't match) so we get "0:[]".
For position 1 it finds "b" (which doesn't match) so we get "1:[]".
For position 2 it finds "34" (a match) and so we get "2:[34]".
Position 3 is gobbled up by the match, so we don't get output for it.
For position 4 it finds "e" (which doesn't match) so we get "4:[]".
For position 5 it finds "f" (which doesn't match) so we get "5:[]".
For position 6 it finds???

That's my question. I'm not sure why we have a 6th find. Is there some sort of implied String terminator ($ in Perl regex speak) that it's finding?

Any ideas?
Thanks,
Josh


Rational Pi Blog - Java, SCJP, Dev Bits- http://rationalpi.wordpress.com
Barry Gaunt
Ranch Hand

Joined: Aug 03, 2002
Posts: 7729
I'll take a guessful stab at this. If you look at the API for Matcher, the group() method, it says:
Note that some patterns, for example a*, match the empty string. This method will return the empty string when the pattern successfully matches the empty string in the input.


So when the find is beginning at character position 6, there is only the empty string left. Because the pattern is "\\d*", this must match (because * means zero or more of the preceding "\\d").

Does that make sense? If so, convince me.
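
One way to check this is to start the search directly at index 6 and see what comes back. The sketch below assumes the same pattern and input as in the thread; the class name EmptyMatchAtEnd and the helper matchesEmptyAt are hypothetical. It uses Matcher.find(int), which resets the matcher and searches from the given index.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchAtEnd {
    // True if a search started at index succeeds right there with a zero-length match.
    static boolean matchesEmptyAt(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(index)
                && m.start() == index
                && m.group().length() == 0;
    }

    public static void main(String[] args) {
        String input = "ab34ef";
        // Nothing is left at index 6, but \d* is satisfied by zero digits.
        System.out.println(matchesEmptyAt("\\d*", input, input.length()));
        // \d+ needs at least one digit, so it cannot match there.
        System.out.println(matchesEmptyAt("\\d+", input, input.length()));
        // The empty string on its own also matches the whole pattern.
        System.out.println(Pattern.matches("\\d*", ""));
    }
}
```

If the explanation holds, the first and third lines print true and the second prints false.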


Ask a Meaningful Question and HowToAskQuestionsOnJavaRanch
Getting someone to think and try something out is much more useful than just telling them the answer.
Ryan Kade
Ranch Hand

Joined: Aug 16, 2005
Posts: 69
I think that's exactly right, Barry. The Java tutorial on the topic says:


A zero-length match can occur in several cases: in an empty input string, at the beginning of an input string, after the last character of an input string, or in between any two characters of an input string.


Convincing?

http://java.sun.com/docs/books/tutorial/extra/regex/quant.html
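
The cases listed in the tutorial quote can be seen by collecting the start index of every zero-length match. This is only a sketch; the class name ZeroLengthMatches and the helper zeroLengthStarts are invented for the example, and the input strings other than ab34ef are assumptions chosen to cover each case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZeroLengthMatches {
    // Returns the start index of every zero-length match of regex in input.
    static List<Integer> zeroLengthStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<Integer>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            if (m.start() == m.end()) { // start == end means nothing was consumed
                starts.add(m.start());
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(zeroLengthStarts("\\d*", ""));       // empty input: [0]
        System.out.println(zeroLengthStarts("\\d*", "ab"));     // start, between, end: [0, 1, 2]
        System.out.println(zeroLengthStarts("\\d*", "ab34ef")); // [0, 1, 4, 5, 6]
        System.out.println(zeroLengthStarts("\\d+", "ab34ef")); // + never matches empty: []
    }
}
```

Swapping * for + removes every zero-length match, which is why "\\d+" would report only 2:[34] for the string in the question.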
Bert Bates
author
Sheriff

Joined: Oct 14, 2002
Posts: 8815
    
    5


Spot false dilemmas now, ask me how!
(If you're not on the edge, you're taking up too much room.)
Joshua Smith
Ranch Hand

Joined: Aug 22, 2005
Posts: 193
Clever Barry. :-)

And thanks for the confirmation Ryan and Bert.

A co-worker and I puzzled over that one a bit and were leaning towards a zero-length string, a null string, some sort of invisible terminator etc. It's just nice to see in writing what is actually happening. Helps to "convince me" too. :-)

Josh
 