Sweet! Thanks for the link (I'll probably be referring to this often).
So: "The asterisk (*) is a 'greedy quantifier,' specifying that whatever precedes it (in this case, any digit) should be matched zero or more times. By allowing for zero occurrences, a match of zero length is possible. Because a match of zero length is possible, the find() method will check the index following the last character of input."
A match of zero length for greedy quantifiers seems weird. Something to definitely keep in mind.
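To see the zero-length matches for myself, I put together a minimal sketch. The input "ab34ef" and the helper name scan() are my own assumptions (I believe it is the K&B example string, but I'm not certain it's the exact code discussed here); the loop just concatenates m.start() and m.group() for every match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Hypothetical helper: appends the start index and the matched text
    // of every match that find() reports, in order.
    static String scan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.start()).append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed input: digits only at indexes 2 and 3.
        // Zero-length matches at 0, 1, 4, 5 and 6 (one past the last character),
        // plus the greedy match "34" starting at 2.
        System.out.println(scan("\\d*", "ab34ef")); // prints 01234456
    }
}
```

The trailing 6 is the part the quoted text describes: find() also checks the index following the last character and reports an empty match there.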
By the way, regex is something that took me quite some time to understand. It doesn't always behave the way you think it will.
For example, if you take the same code you gave but change the pattern to "\\d*?", what do you think the result will be?
I just checked it, and Javier is right. Only greedy quantifiers are on the exam. But reluctant quantifiers are in the K&B book so it's not totally irrelevant.
What you probably expected was that m.group() would print '3' at position 2 and '4' at position 3, which would give the output: 012334456
The expression "\\d*?" searches for zero or more digits, and since it uses a reluctant quantifier it will match AS LITTLE AS POSSIBLE. At position 2 it comes across '3', which is a digit. Instead of returning this digit, it returns zero digits, because an empty match is the smallest match that still satisfies the expression. The same thing happens at every other position, so the output is indeed: 0123456
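Here is a rough sketch of that behaviour, assuming the input is "ab34ef" and using a made-up helper scan() that concatenates m.start() and m.group() per match (neither is confirmed as the original code). For contrast I also run "\\d+?", which must consume at least one digit and so cannot fall back to an empty match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantDemo {
    // Hypothetical helper: start index followed by matched text, for each match.
    static String scan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.start()).append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "ab34ef"; // assumed input, digits at indexes 2 and 3

        // Reluctant *? is satisfied by zero digits, so every match is empty
        // and only the start indexes 0 through 6 show up.
        System.out.println(scan("\\d*?", input)); // prints 0123456

        // Reluctant +? needs at least one digit, so it takes exactly one:
        // "3" at index 2 and "4" at index 3.
        System.out.println(scan("\\d+?", input)); // prints 2334
    }
}
```

So the reluctant quantifier only starts consuming digits when something forces it to, either a minimum count (as with +?) or a following part of the pattern that cannot match otherwise.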
I hope this explanation makes it clearer, because I find it hard to explain. If it's still unclear what the regex returns, try playing with the next piece of code.