Hi,
I was reading greedy quantifier in K&B. I came across zero-length match which are pretty tricky. Here is a slight modification
to the exam and that's the input given from the console

From the console here is the input given

In my opinion, the instruction is asking to find one or zero match of aa. But here is the output generated

Can someone explain this behavior. Another additional thing, inside the book it is written that the engine backtracks as well. Why is that so?

Been Zaidi wrote:From the console here is the input given

In my opinion, the instruction is asking to find one or zero match of aa. But here is the output generated

No, the ? only binds to the second a, so the pattern that regex is looking for is one 'a' followed by zero or one 'a'
To look for one or zero matches of aa, the regex would be "(aa)?".

Been Zaidi wrote:Another additional thing, inside the book it is written that the engine backtracks as well. Why is that so?

Imagine you have the regex ".*c" and are looking for a match in "abc". Since the * is greedy, the initial match for the beginning of the regex (".*") would be as long as possible, therefore matching the whole string. But then the end of the regex wouldn't fit anymore, so the pattern matching has to backtrack, making .* only match ab instead of abc.

Been Zaidi wrote:From the console here is the input given

In my opinion, the instruction is asking to find one or zero match of aa. But here is the output generated

No, the ? only binds to the second a, so the pattern that regex is looking for is one 'a' followed by zero or one 'a'
To look for one or zero matches of aa, the regex would be "(aa)?".

Been Zaidi wrote:Another additional thing, inside the book it is written that the engine backtracks as well. Why is that so?

Imagine you have the regex ".*c" and are looking for a match in "abc". Since the * is greedy, the initial match for the beginning of the regex (".*") would be as long as possible, therefore matching the whole string. But then the end of the regex wouldn't fit anymore, so the pattern matching has to backtrack, making .* only match ab instead of abc.

Dear Nobert,

I am able to understand the first part of your explanation. It was very helpful. Thanks a lot.

The quoted part, i am a little confused. Can you elaborate it a little simply. Looking forward to hear.

Hello Guyz,
The matcher will search for the atleast one "a". These quantifiers will apply for the exactly one character that means it will find one 'a' and then 0 or 1 a. Then it will find the first 'a' which is based on the criteria one 'a' then 0 or 1 'a'. Then it will find the second match with the 'aa'.