doubt in chapter 6- Regex from Kathy Sierra Book

 
Ranch Hand
Posts: 79
Hi,

The following is taken from Chapter 6, pages 496 and 497:

The way to think about this is to consider the name greedy. In order for the second answer to be correct, the regex engine would have to look (greedily) at the entire source data before it could determine that there was an xx at the end. So in fact, the second result is the correct result because in the original example we used the greedy quantifier *. The result that finds two different sets can be generated by using the reluctant quantifier *?. Let's review:


source: yyxxxyxx
pattern: .*xx

is using the greedy quantifier * and produces

0 yyxxxyxx

If we change the pattern to

source: yyxxxyxx
pattern: .*?xx

we're now using the reluctant quantifier *?, and we get the following:

0 yyxx
4 xyxx


The greedy quantifier does in fact read the entire source data, and then it works backward (from the right) until it finds the rightmost match. At that point, it includes everything from earlier in the source data up to and including the data that is part of the rightmost match.


I am not able to understand the text that I have highlighted in red. Could someone please help me understand this portion?
 
Ranch Hand
Posts: 44
I am not sure I can explain it better than the KB book does, but I'll try.
In plain English, your pattern says "give me zero or more of any character, followed by xx". If you look at your source without consuming it, these are the possible matches starting at index 0:
yyxx
yyxxx
yyxxxyxx

The greedy quantifier, being greedy by nature, wants to consume as much as it can, so it returns the largest match (yyxxxyxx).
The reluctant quantifier is the opposite: it consumes as little as it can before moving on, so its first match is yyxx. The search then resumes after that match, finds another one (xyxx), and returns that as well.
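If you want to see this for yourself, here is a small sketch using java.util.regex. The class name GreedyVsReluctant and the helper findAll are just names I made up for this post, not anything from the book. It prints each match as "start index, then matched text", the same layout the book uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {

    // Hypothetical helper: collects every match as "startIndex matchedText".
    static List<String> findAll(String regex, String source) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(source);
        // find() resumes searching where the previous match ended
        while (m.find()) {
            results.add(m.start() + " " + m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String source = "yyxxxyxx";

        // Greedy: .* grabs the whole source, then backs off until xx fits
        System.out.println(findAll(".*xx", source));   // prints [0 yyxxxyxx]

        // Reluctant: .*? grows one character at a time until xx fits
        System.out.println(findAll(".*?xx", source));  // prints [0 yyxx, 4 xyxx]
    }
}
```

Note that the greedy run finds only one match because it has already consumed the whole source, so there is nothing left for a second find(). The reluctant run stops at index 4 after the first match and still has xyxx left to search.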
 
Loganathan Karunakaran
Ranch Hand
Posts: 79
Thanks for your explanation Asad.
I also looked at the link http://download.oracle.com/javase/tutorial/essential/regex/quant.html
 