
regex doubt

 
Bruno Sant Ana
hey guys,

My doubt is related to this code:



The output is:
false true b

but if I uncomment the line m.reset(); the output changes to:
false true aaaaaaab

I don't really understand why the output changes when I reset the Matcher. It seems that when the line is commented out, the text is evaluated starting from the letter "b", skipping the letters "a", right? Why?

The documentation says that the find method starts evaluating the text at the beginning or, if a previous call to find was successful, at the first character that was not matched by that call:
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html#find()
"This method starts at the beginning of the input sequence or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match."
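To illustrate the quoted sentence, here is a minimal sketch of successive find() calls. The class name, the helper findAll and the input string "aab ab b" are assumptions made up for the illustration; they are not taken from the code in question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSequenceDemo {
    // Collects every match that repeated find() calls report, in order.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> found = new ArrayList<>();
        // Each successful find() resumes at the first character
        // not matched by the previous successful find().
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical input chosen for the illustration.
        System.out.println(findAll("a*b", "aab ab b")); // [aab, ab, b]
    }
}
```

Note that the quoted sentence only describes what happens after an earlier find(); it does not say where find() starts after a call to matches().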

I don't know whether this part of the documentation has anything to do with my doubt, or whether it helps to explain the different outputs.

Thanks
 
Winston Gutkowski
Bruno Sant Ana wrote: but if I uncomment the line m.reset(); the output changes to:
false true aaaaaaab

Interesting question. It has to do with how regex (or rather Matcher) keeps track of its position when attempting a match.

I suspect (though I'm not absolutely sure about this) that a failed matches() leaves the matcher positioned at the point where it determined that there was no match. I suspect that point is just after the run of 'a's that was NOT followed by "b$", that is, at the first character following all those 'a's. A later find() then resumes from there instead of from the start. And, since your pattern is "a*b", not "a+b", the 'b' on its own matches the pattern.

Like I say, I understand the result, but you might want a second opinion on my interpretation.
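Here is a minimal sketch of the two cases. The input "aaaaaaabc" is an assumption, because the original string is not shown; it is simply one input for which matches() fails on "a*b". The class and helper names are made up as well. Only the reset case is documented behaviour. Where find() resumes after a failed matches() looks like an internal detail that may depend on the JDK version, so the sketch prints that result without claiming what it will be.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherResetDemo {
    // Calls matches(), optionally reset(), then find().
    // Returns the text found by find(), or null if find() fails.
    static String findAfterMatches(String regex, String input, boolean reset) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.matches();       // tries to match the whole input
        if (reset) {
            m.reset();     // discards matcher state; find() starts at index 0
        }
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaaaaabc"; // assumed input, not the original one

        // false: the trailing 'c' prevents a whole-input match
        System.out.println(Pattern.compile("a*b").matcher(input).matches());

        // With reset(): find() starts at the beginning -> aaaaaaab
        System.out.println(findAfterMatches("a*b", input, true));

        // Without reset(): start position is not specified, result not asserted
        System.out.println(findAfterMatches("a*b", input, false));
    }
}
```

If you need find() to behave predictably after matches(), calling reset() in between is the safe choice.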

Winston

PS: If you understand regexes, then you need to also understand that matches("a*b") is the same as find("^a*b$") (well, almost).
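The "almost" can be made concrete with a small sketch. The class name, helper names and test strings below are assumptions chosen for the illustration. The one difference shown is that "$" also matches just before a final line terminator, while matches() insists on consuming the entire input.

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    // True if the regex matches the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the (already anchored) regex is found in a fresh matcher.
    static boolean anchoredFind(String anchoredRegex, String input) {
        return Pattern.compile(anchoredRegex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Same answer for ordinary inputs.
        System.out.println(wholeMatch("a*b", "aaaaaaab"));      // true
        System.out.println(anchoredFind("^a*b$", "aaaaaaab"));  // true
        System.out.println(wholeMatch("a*b", "aaaaaaabc"));     // false
        System.out.println(anchoredFind("^a*b$", "aaaaaaabc")); // false

        // Different answer when the input ends with a line terminator.
        System.out.println(wholeMatch("a*b", "aab\n"));         // false
        System.out.println(anchoredFind("^a*b$", "aab\n"));     // true

        // \z matches only at the very end of the input.
        System.out.println(anchoredFind("^a*b\\z", "aab\n"));   // false
    }
}
```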
 