
searching for "\d*" in a string

 
J Brewer
Ranch Hand
Posts: 46
In the following question:



It says that the answer is E, but I get: 0123445. I don't understand where the '6' comes in?
 
wise owen
Ranch Hand
Posts: 2023
 
J Brewer
Ranch Hand
Posts: 46
Thanks! I saw this in the K&B book, and on a mock exam, and I thought it was a mistake...
 
J Brewer
Ranch Hand
Posts: 46
but of course knew it was much more likely that I was mistaken.
 
Vijay Raj
Ranch Hand
Posts: 110
It's a greedy quantifier, and that's why a zero-length match was applied at the end of the string, giving a 6 at the end. Fine. But why wasn't the zero-length match also applied at the beginning of the string, which would have resulted in "001234456"?

regards,
vijay.
 
Henry Wong
author
Marshal
Posts: 20881
Originally posted by Vijay Raj:
It's a greedy quantifier, and that's why a zero-length match was applied at the end of the string, giving a 6 at the end. Fine. But why wasn't the zero-length match also applied at the beginning of the string, which would have resulted in "001234456"?

regards,
vijay.


The zero-length match was applied at the beginning of the string, which is why the first value printed is zero. Are you asking whether the zero-length match should be applied twice at the beginning of the string?

Henry
 
Vijay Raj
Ranch Hand
Posts: 110
The zero we got in the answer is because of 'a', the first character in the input string, right?
a - prints 0
b - prints 1
3 - prints 234 (2 being m.start() and 34 being m.group())
4 - prints nothing because it has already been visited
e - prints 4
f - prints 5
At last, it prints 6, where the zero-length match is performed, because f lies between index 5 and index 6. Since the while loop runs to the end, that is, up to the length of the string, it performs a zero-length match there. Am I right so far? I just need to confirm whether I am going in the right direction.

If yes, then why is there not a zero-length match at the beginning, that is, at index 0?

regards,
vijay.
 
Henry Wong
author
Marshal
Posts: 20881
The zero we got in the answer is because of 'a', the first character in the input string, right?


No... "a" does not match the regular expression -- neither does "b", "e", or "f". If "a" did match the regular expression, then the output would have been "0a", instead of "0".
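
A minimal sketch of this point, assuming the input string is "ab34ef" (inferred from the characters discussed in this thread, since the question's code is not shown here). The class and method names are hypothetical. The first find() succeeds at index 0 but consumes no characters, so m.group() is the empty string rather than "a":

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatchDemo {
    // Describes the first match of regex in input as "start,end,[group]".
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.start() + "," + m.end() + ",[" + m.group() + "]";
    }

    public static void main(String[] args) {
        // "\\d*" accepts zero digits, so it matches the empty string at index 0.
        System.out.println(firstMatch("\\d*", "ab34ef")); // 0,0,[]
        // "\\d+" needs at least one digit, so its first match is "34" at index 2.
        System.out.println(firstMatch("\\d+", "ab34ef")); // 2,4,[34]
    }
}
```

With "\d+" the letters are simply skipped; with "\d*" each of them becomes the site of an empty match, which is where the extra digits in the output come from.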

Henry
 
Vijay Raj
Ranch Hand
Posts: 110
The regex engine checks the character between index 0 and index 1 and finds zero or more '\d's there. Therefore, it returns 0 as m.start() and "" as m.group(), because it found no '\d'. Similarly, it checks the character between index 1 and index 2, and so on. After checking the character between index 5 and index 6, it goes to index 6 to do a zero-length match.

Now, what I wanted to ask was why it didn't do a zero-length match at the beginning, at index 0.

regards,
vijay.
 
Henry Wong
author
Marshal
Posts: 20881
Now, what I wanted to ask was why it didn't do a zero-length match at the beginning, at index 0.


Here is the breakdown of the results. Each entry is m.start() followed by m.group(), and m.group() is empty for a zero-length match:

0 - zero length match before the first character -- at index 0
1 - zero length match after the previous match -- at index 1
234 - A match of "34" at index 2
4 - zero length match after the previous match -- at index 4
5 - zero length match after the previous match -- at index 5
6 - zero length match after the previous match -- at index 6
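
The breakdown above can be reproduced with a short sketch. It assumes the question's loop prints m.start() followed by m.group() for every find(), and that the input is "ab34ef"; both are inferred from this thread, and the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopDemo {
    // Concatenates m.start() and m.group() for every match found in input.
    static String trace(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.start()).append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Empty matches at 0 and 1, "34" at 2, then empty matches at 4, 5 and 6.
        System.out.println(trace("\\d*", "ab34ef")); // 01234456
    }
}
```

Each position is reported once, so index 0 appears a single time. The position after the last character (index 6) is also a valid place for an empty match, which accounts for the trailing 6.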

Henry
 