regex doubt

Bruno Sant Ana

Joined: May 17, 2012
Posts: 29
hey guys,

My doubt is related to this code:

The output is:
false true b

But if I uncomment the line m.reset(); the output changes to:
false true aaaaaaab

I don't really understand why the output changes when I reset the Matcher. It seems that when the line is commented out, the text is evaluated starting from the letter "b", skipping the letters "a", right? Why?

The documentation says that the find method starts evaluating the text from the beginning or, if a previous call to find succeeded, from the first character that the previous match did not consume:
"This method starts at the beginning of the input sequence or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match."

I don't know whether this part of the documentation has anything to do with my doubt, or whether it helps to explain the different outputs.
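The behaviour described in the quoted documentation can be seen in a minimal sketch like the one below. It is not the code from the post; the class name FindResumeDemo, the helper findAll, and the input "aab ab b" are all made up for illustration. It shows that each successful find() resumes after the previous match, and that reset() sends the next find() back to the start of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and input are illustrative only.
class FindResumeDemo {

    // Calls find() repeatedly and collects each matched substring.
    static List<String> findAll(Matcher m) {
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("a*b").matcher("aab ab b");

        // Each successful find() starts after the previous match.
        System.out.println(findAll(m)); // prints [aab, ab, b]

        // reset() discards the position, so find() starts at index 0 again.
        m.reset();
        System.out.println(m.find() + " " + m.group()); // prints true aab
    }
}
```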

Winston Gutkowski

Joined: Mar 17, 2011
Posts: 8927

Bruno Sant Ana wrote: But if I uncomment the line m.reset(); the output changes to:
false true aaaaaaab

Interesting question. It has to do with how regex (or rather Matcher) "positions" when attempting a match.

I suspect (though I'm not absolutely sure about this) that matches() leaves the matcher positioned at the point where it determined that there was no match, which I suspect is after any number of 'a's NOT followed by "b$", i.e. at the first character following all those 'a's. And, since your pattern is "a*b", not "a+b", the 'b' on its own matches the pattern.

As I say, I understand the result, but you might want a second opinion on my interpretation.
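For anyone who wants to experiment with the positioning idea, here is a small sketch. It is not a reconstruction of the original code, and it does not settle what a failed matches() does to the position. Instead it assumes a hypothetical input "aaaaaaabb" and uses a successful lookingAt() call, which in the standard JDK implementation leaves the matcher positioned after the prefix it matched. The class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo; the input and the use of lookingAt() are assumptions,
// not the original poster's code.
class ResetDemo {

    // Runs lookingAt() and then find(), optionally calling reset() in between,
    // and returns the text matched by find() (or null if find() fails).
    static String findAfterLookingAt(String input, boolean resetFirst) {
        Matcher m = Pattern.compile("a*b").matcher(input);
        m.lookingAt();      // matches the prefix "aaaaaaab" of "aaaaaaabb"
        if (resetFirst) {
            m.reset();      // forget the position left by lookingAt()
        }
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Without reset(), find() continues after the earlier match.
        System.out.println(findAfterLookingAt("aaaaaaabb", false)); // prints b

        // With reset(), find() starts again from the beginning.
        System.out.println(findAfterLookingAt("aaaaaaabb", true));  // prints aaaaaaab
    }
}
```

The same pair of outputs ("b" versus "aaaaaaab") appears in the question, which at least suggests that some earlier match operation had moved the matcher's position before find() ran.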


PS: If you understand regexes, then you also need to understand that calling matches() with the pattern "a*b" is the same as calling find() with the pattern "^a*b$" (well, almost).
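A quick way to check that equivalence, and one case behind the "almost", is sketched below. The class and method names are hypothetical. The difference shown relies on the documented behaviour of "$" in Java, which by default also matches just before a line terminator at the very end of the input, whereas matches() must consume the whole input.

```java
import java.util.regex.Pattern;

// Hypothetical demo comparing matches() with an anchored find().
class AnchorDemo {

    // True only if the entire input matches a*b.
    static boolean viaMatches(String s) {
        return Pattern.compile("a*b").matcher(s).matches();
    }

    // True if the anchored pattern ^a*b$ is found in the input.
    static boolean viaAnchoredFind(String s) {
        return Pattern.compile("^a*b$").matcher(s).find();
    }

    public static void main(String[] args) {
        // The two approaches agree on ordinary input.
        System.out.println(viaMatches("aaab") + " " + viaAnchoredFind("aaab"));   // prints true true
        System.out.println(viaMatches("aaabb") + " " + viaAnchoredFind("aaabb")); // prints false false

        // They differ when the input ends with a line terminator.
        System.out.println(viaMatches("aaab\n") + " " + viaAnchoredFind("aaab\n")); // prints false true
    }
}
```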

Bats fly at night, 'cause they aren't we. And if we tried, we'd hit a tree -- Ogden Nash (or should've been).