
Reg Ex Question

 
Greenhorn
Posts: 24
Can someone explain the output below? I am pretty confused (from the KB book).

import java.util.regex.*;
class Regex {
    public static void main(String [] args) {
        Pattern p = Pattern.compile(args[0]);
        Matcher m = p.matcher(args[1]);
        boolean b = false;
        System.out.println("Pattern is " + m.pattern());
        while(b = m.find()) {
            System.out.println(m.start() + " " + m.group());
        }
    }
}

% java Regex "\d\w" "ab4 56_7ab"

Produces the output
Pattern is \d\w
4 56
7 7a


Please explain how this output is produced.

Ashish
 
Rancher
Posts: 43081
How about if you post how you think this works (or how it fails to work according to your understanding)? Explaining something is a great way to make sure you really understand it. It's also the approach we prefer here at JavaRanch.
 
Greenhorn
Posts: 25
Hi Ashish,
You ran the program with the pattern "\d\w" and the input "ab4 56_7ab".

The start() method returns the index in the input string where the current match begins, and group() returns the matched text, which starts at that index.

\d - matches a single digit in the input
\w - matches a single word character: a letter, a digit, or "_"

Your output is:

4 56 (the first match starts at index 4: \d matches 5 and \w matches 6)
7 7a (the second match starts at index 7: \d matches 7 and \w matches a)

You might be confused by the underscore. It is not in the result because the 6 just before it was already consumed by the first match, and a consumed character is not reused in the next match. If you insert another digit just before the underscore, the underscore will appear in the result.

I hope this answers your question. If anything is still unclear, please reply.
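To check this for yourself, here is a small sketch that runs the same find() loop on the original input and on a variant with one extra digit before the underscore. The class name RegexTrace and the helper findAll are made up for this illustration; they are not from the book.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTrace {
    // Hypothetical helper: collects "start group" for every match,
    // mirroring the while(m.find()) loop in the original program.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> results = new ArrayList<>();
        while (m.find()) {
            results.add(m.start() + " " + m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        // In Java source the backslashes are doubled; on the command line they are not.
        System.out.println(findAll("\\d\\w", "ab4 56_7ab"));  // [4 56, 7 7a]
        // One extra digit before the underscore: now "7_" is a match as well.
        System.out.println(findAll("\\d\\w", "ab4 567_7ab")); // [4 56, 6 7_, 8 7a]
    }
}
```

In the second call the 7 at index 6 has not been used by an earlier match, so it can pair with the underscore.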

Mamta
 
Ashish Soni
Greenhorn
Posts: 24
Thanks, I got it now.
 
Ranch Hand
Posts: 113
Hi Mamta,

You mentioned that \d looks for a digit. There is a digit at index 2, before the one at index 4. Since the first digit is at index 2, shouldn't the output start at index 2?

Please correct me if I am wrong.

thanks,
Jyothsna
 
Jyothsna Panchagnula
Ranch Hand
Posts: 113
Hello All,

Can anyone shed some light on the output of the program posted above?

Jyothsna
 
Ranch Hand
Posts: 231

Originally posted by Jyothsna:
As I see that the digit is starting from 2 the output should start from index no 2.



search string : "ab4 56_7ab"
pattern : "\d\w"
The pattern requires two consecutive characters: a digit immediately followed by a word character (digit, letter, or _).

There is a digit at index 2, but the character after it is a *space*, so those two characters do not match the pattern.
Moreover, "6_" on its own also matches the pattern, but the "6" has already been consumed by the match "56". find() resumes searching after the end of the previous match, so "6_" is never reported.
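Both points can be checked with a short sketch. The class name RegexOverlap and the helper firstMatchFrom are invented for this example; Matcher.find(int) is the standard method that resets the matcher and starts searching at the given index.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexOverlap {
    // Hypothetical helper: returns "start group" for the first match found
    // at or after index 'from', or null if there is none.
    static String firstMatchFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(from) ? m.start() + " " + m.group() : null;
    }

    public static void main(String[] args) {
        String input = "ab4 56_7ab";

        // The digit at index 2 is followed by a space, so "4 " does not match \d\w.
        System.out.println(Pattern.matches("\\d\\w", "4 "));   // false

        // Searching from index 2 skips that digit and lands on "56" at index 4.
        System.out.println(firstMatchFrom("\\d\\w", input, 2)); // 4 56

        // Starting the search at index 5 shows that "6_" does match on its own;
        // the plain find() loop never sees it because "56" already consumed the 6.
        System.out.println(firstMatchFrom("\\d\\w", input, 5)); // 5 6_
    }
}
```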
 