Originally posted by podonga poron: but if i apply "\\d*" to a88abc i get "013456" WHY ??
At index 0 it matches the empty string preceding the first 'a'.
At index 1 it matches "88".
At index 3 it matches the empty string between '8' and 'a'.
At index 4 it matches the empty string between 'a' and 'b'.
At index 5 it matches the empty string between 'b' and 'c'.
At index 6 it matches the empty string following 'c'.
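If you want to watch this happen, here is a small sketch that prints each match as find() reports it. I'm assuming you got "013456" by appending m.start() in a find() loop; the class and method names below are my own, not from the book.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Hypothetical helper: concatenates the start index of every match
    // that find() reports, and prints each matched text along the way.
    static String matchStarts(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder starts = new StringBuilder();
        while (m.find()) {
            // group() is "" whenever the regex matched zero characters
            System.out.println(m.start() + ": \"" + m.group() + "\"");
            starts.append(m.start());
        }
        return starts.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchStarts("\\d*", "a88abc")); // 013456
    }
}
```

Only the line for index 1 shows any matched text ("88"); every other line shows an empty pair of quotes.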
The parts that are hardest to understand are:
At index 3: it just finished matching two digits; why does it match again at the index where that match ended?
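Answer: The regex is allowed to match zero characters, so it will match at every position where it's tried. After "88" is consumed, the next attempt starts at index 3; the 'a' there isn't a digit, so the regex succeeds by matching nothing.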
At index 6: the string is only six characters long, which means the last valid index is 5; how can it match something at index 6?
Answer: it isn't matching a character, it's matching the nothing after the last character. It might help if you think of it as being between the last character and the end of the string, since regexes let you match the end of a string with the '$' metacharacter.
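To convince yourself that index 6 is a legitimate match position, you can try '$' on its own. This is just a sketch, and the class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndOfInputDemo {
    // Hypothetical helper: start index of the first match, or -1 if none.
    static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // '$' consumes no characters; it matches the position after 'c'
        System.out.println(firstMatchStart("$", "a88abc")); // 6
        System.out.println("a88abc".length());              // 6
    }
}
```

The match starts at 6, which equals the string's length: a position, not a character.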
The HORRIBLE book (i hate it) says
I don't know about the book, but I agree that this part of it is horrible. This question is constantly being asked here because the authors did such a terrible job of explaining it.