
regex doubt

Bruno Sant Ana
Greenhorn

Joined: May 17, 2012
Posts: 27
hey guys,

My doubt is related to this code:



The output is:
false true b

but if I uncomment the line m.reset(); the output changes to:
false true aaaaaaab

I don't really understand why the output changes when I reset the Matcher. It seems that when the line is commented out, the text is evaluated starting from the letter "b", skipping the "a"s, right? Why?

The documentation says that the find method starts evaluating the text from the beginning or, if a previous call to find was successful, from the first character that the previous match did not consume:
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html#find()
"This method starts at the beginning of the input sequence or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match."

I don't know whether this part of the documentation has anything to do with my doubt, or whether it helps you explain the different outputs to me.
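The find() behaviour quoted above can be seen in a small sketch. This is not the code from the original post; the class name FindResetDemo, the helper run and the input "aab ab" are made up for illustration. It only shows that a second find() resumes after the previous successful match, while reset() sends the next find() back to the start of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindResetDemo {
    // Hypothetical helper: calls find() twice on "aab ab" and records what each
    // call matched. If resetBeforeSecondFind is true, reset() is called in between.
    static List<String> run(boolean resetBeforeSecondFind) {
        Matcher m = Pattern.compile("a*b").matcher("aab ab");
        List<String> groups = new ArrayList<>();

        if (m.find()) {
            groups.add(m.group()); // first match starts at the beginning of the input
        }
        if (resetBeforeSecondFind) {
            m.reset(); // discard the matcher's state; the next find() starts at index 0
        }
        if (m.find()) {
            groups.add(m.group()); // resumes after the first match unless reset() was called
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [aab, ab]
        System.out.println(run(true));  // prints [aab, aab]
    }
}
```

Without reset() the second call continues at the first character the previous match did not consume; with reset() it finds the first match again.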

Thanks
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7639
    

Bruno Sant Ana wrote:but if I uncomment the line m.reset(); the output changes to:
false true aaaaaaab

Interesting question. It has to do with how regex (or rather Matcher) "positions" when attempting a match.

I suspect (though I'm not absolutely sure about this) that matches() leaves the matcher positioned at the point where it determined that there was no match, which I suspect is after any number of 'a's NOT followed by "b$", i.e. at the first character following all those 'a's. And, since your pattern is "a*b", not "a+b", a 'b' on its own matches the pattern.
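The "a*b" versus "a+b" point is easy to check in isolation. A minimal sketch, with a made-up class name StarVersusPlus and helper fullMatch; it says nothing about where the matcher is positioned after a failed matches(), only that a lone "b" satisfies "a*b" but not "a+b".

```java
import java.util.regex.Pattern;

public class StarVersusPlus {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a*b", "b"));  // true: a* may match zero 'a's
        System.out.println(fullMatch("a+b", "b"));  // false: a+ needs at least one 'a'
        System.out.println(fullMatch("a+b", "ab")); // true
    }
}
```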

Like I say, I understand the result, but you might want a second opinion on my interpretation.

Winston

PS: If you understand regexes, then you need to also understand that matches("a*b") is the same as find("^a*b$") (well, almost).
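A rough sketch of the comparison in the PS, including one case behind the "almost". The class and method names are invented for the example. By default "$" in Java also matches just before a final line terminator, so the anchored find() accepts an input ending in "\n" that matches() rejects.

```java
import java.util.regex.Pattern;

public class MatchesVersusAnchoredFind {
    // Whole-input match against the unanchored pattern.
    static boolean viaMatches(String input) {
        return Pattern.compile("a*b").matcher(input).matches();
    }

    // Search using the explicitly anchored pattern.
    static boolean viaAnchoredFind(String input) {
        return Pattern.compile("^a*b$").matcher(input).find();
    }

    public static void main(String[] args) {
        // Same answer for ordinary input
        System.out.println(viaMatches("aaaaaaab") + " " + viaAnchoredFind("aaaaaaab"));   // true true
        System.out.println(viaMatches("aaaaaaabc") + " " + viaAnchoredFind("aaaaaaabc")); // false false

        // Different answer when the input ends with a line terminator
        System.out.println(viaMatches("aaaaaaab\n") + " " + viaAnchoredFind("aaaaaaab\n")); // false true
    }
}
```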


Isn't it funny how there's always time and money enough to do it WRONG?