All code in my posts, unless a source is explicitly mentioned, is my own.
Ruben Soto wrote: And, since we are talking about greedy and reluctant quantifiers:
?? will only match the empty string
*? will only match the empty string
+? will only match one instance of the item it's quantifying
Henry Wong wrote:
Ruben Soto wrote: And, since we are talking about greedy and reluctant quantifiers:
?? will only match the empty string
*? will only match the empty string
+? will only match one instance of the item it's quantifying
Example, please. What do you mean that these quantifiers "will only match the empty string"? Whether they match an empty string also depends on the whole pattern, not just on the quantifier.
Henry
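To make the point about the whole pattern concrete, here is a minimal sketch using java.util.regex. The class name ReluctantDemo and the helper firstMatch are made up for illustration and are not from any post in this thread. It shows a reluctant quantifier matching nothing when it stands alone, and matching more when the rest of the pattern cannot match otherwise.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantDemo {
    // Hypothetical helper: returns the first substring reported by find(),
    // or null if the pattern does not match anywhere in the input.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Alone, a reluctant quantifier is satisfied with zero characters.
        System.out.println("[" + firstMatch("a*?", "aaab") + "]");     // []
        // Followed by a required 'b', it has to consume every 'a' first.
        System.out.println("[" + firstMatch("a*?b", "aaab") + "]");    // [aaab]
        // The optional 'x' is consumed because "12" alone cannot match "1x2".
        System.out.println("[" + firstMatch("1x??2", "1x2") + "]");    // [1x2]
        // Greedy versus reluctant on the same input.
        System.out.println("[" + firstMatch("<.*>", "<a><b>") + "]");  // [<a><b>]
        System.out.println("[" + firstMatch("<.*?>", "<a><b>") + "]"); // [<a>]
    }
}
```

The "1x??2" case is the one to look at: dropping the reluctant part would give the pattern "12", which does not match "1x2" at all.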
I think that "(anything)??" will only match the empty string. The same goes for "(anything)*?", ".??", ".*?", "1??", "a*?", etc. Since the lower bound of ? and * is 0 occurrences, making them reluctant makes them match only zero instances (the empty string).
If used as a part of a bigger pattern, they won't affect the result because of this reason.
For example, the pattern "1234.?(anything)??abc(anything)*?" is equivalent to the pattern "1234.?abc"
I think that the strategy is "match as little as possible, while allowing the overall pattern to match starting at the earliest index of the source string." That is not as catchy, but seems to more accurately depict how the regex engine handles this.
Henry Wong wrote:
I think that the strategy is "match as little as possible, while allowing the overall pattern to match starting at the earliest index of the source string." That is not as catchy, but seems to more accurately depict how the regex engine handles this.
Not really. There are two things going on -- and they are only somewhat related.
First, the regex. In this case, the whole pattern can match zero characters, and there is no bigger pattern that will force it to match more, so it will always produce a zero-length match in this example.
Second, the find() method. The method starts from the beginning of the string looking for a match. If one is found, the next search starts at the end of that match. If the match is zero length, however, the search moves on to the next character before looking again. And finally, if it doesn't find a match at the current location, it moves to the next location.
The behavior of the find() method is the same, regardless of the regex pattern. So, it is better to keep those two things in mind, and try to keep them separated.
Henry
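The find() behavior described above can be observed with a short sketch. The class name FindLoopDemo and the helper allMatches are hypothetical, added here only for illustration. Each entry records start-end:text for one successful find() call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopDemo {
    // Hypothetical helper: calls find() until it fails and records
    // "start-end:text" for every match it reports.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Reluctant: a zero-length match at every position, including the end.
        System.out.println(allMatches("a*?", "aab")); // [0-0:, 1-1:, 2-2:, 3-3:]
        // Greedy: "aa" first, then empty matches at the remaining positions.
        System.out.println(allMatches("a*", "aab"));  // [0-2:aa, 2-2:, 3-3:]
        // No match at 0 or 1, so find() moves along until it reaches 'b'.
        System.out.println(allMatches("b", "aab"));   // [2-3:b]
    }
}
```

The loop is identical for all three patterns; only the regex decides how long each match is, which is the separation between the two concerns.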