| Author |
java.util.regex
|
ishmayel vemuru
Ranch Hand
Joined: Jun 13, 2007
Posts: 45
|
|
Hi, this is Ishmayel. I am going to take my SCJP exam on the 25th of this month. I am unable to understand the following example; can anyone help me, please?

    import java.util.regex.*;

    class Regex {
        public static void main(String[] args) {
            Pattern p = Pattern.compile(args[0]);
            Matcher m = p.matcher(args[1]);
            boolean b = false;
            System.out.println("Pattern is " + m.pattern());
            while (b = m.find()) {
                System.out.println(m.start() + " " + m.group());
            }
        }
    }

    java Regex "\d\w" "ab4 56_7ab"

I am unable to understand what the start() and group() methods do.

Thanks in advance,
Ishmayel.
|
 |
Taariq San
Ranch Hand
Joined: Nov 20, 2007
Posts: 189
|
|
Hi Ishmayel, you should read the API documentation when you're not sure what the methods are doing; that's where you'll really learn Java. For instance, start() says "Returns the start index of the previous match." and group() says "Returns the input subsequence matched by the previous match."
|
 |
Deepak Bala
Bartender
Joined: Feb 24, 2006
Posts: 6601
|
|
|
Ishmayel, what is the source of this question?
|
|
 |
ishmayel vemuru
Ranch Hand
Joined: Jun 13, 2007
Posts: 45
|
|
Hi Taariq San and John Meyers,

Thanks for your replies. I was a little busy with work, which is why I didn't ask my question properly. Please have a look at the following:

    Pattern p = Pattern.compile(args[0]);
    Matcher m = p.matcher(args[1]);
    boolean b = false;
    System.out.println("Pattern is " + m.pattern());
    while (b = m.find()) {
        System.out.println(m.start() + " " + m.group());
    }

    % java Regex "\d\w" "ab4 56_7ab"

produces the output:

    Pattern is \d\w
    4 56
    7 7a

Here is my doubt.

    \d --------- A digit
    \w --------- A word character (letters, digits, or "_" (underscore))

That much is OK. I read the API and the K&B book:

    m.find() ----> returns true whenever the pattern is matched in the input string
    start()  ----> Returns the start index of the previous match.
    group()  ----> Returns the input subsequence matched by the previous match.

As per my analysis:

    index of input string --> 0123456789
    input string          --> ab4 56_7ab
    pattern               --> \d\w

To my thinking, the pattern will match three times:

    1st time: "4 "   start() returns: 4   group() returns: 56 (I am not able to understand this)
    2nd time: "6_"   start() returns: 6   group() returns: 7a, as per the first iteration
    3rd time: "7a"   start() returns: 7   group() returns: b

That is how I am thinking about it. Can anyone help me find the correct way to analyze this regex example?

Thanks in advance,
Ishmayel.
|
 |
Swathi Kota
Ranch Hand
Joined: Jun 04, 2008
Posts: 52
|
|
To John Meyers: this question is from the K&B book. I too did not understand how the pattern \d\w produced that output. Can somebody help me with this?
|
SCJP 6, SCWCD 5
Success is how high you bounce when you hit bottom
|
 |
Daniel Del Moral
Ranch Hand
Joined: May 24, 2008
Posts: 32
|
|
The problem is the digit 6. As you noticed, the first match is 56, as it is the first occurrence of a digit (\d) followed by a word character (\w). Now, as the digit 6 has already been consumed by that match, the search for the next match begins right after the 6, so the next match is 7a.
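
A minimal sketch of the point about the consumed 6, assuming the same input string as the K&B example. The class name ConsumedDigitDemo and the helper firstMatchFrom are made up for illustration; Matcher.find(int) is the standard overload that resets the matcher and starts searching at the given index. It suggests that "6_" would match \d\w if a search began at index 5, but a plain find() resumes at end() of the previous match, which is index 6.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsumedDigitDemo {
    // Hypothetical helper: "start group" of the first match found when the
    // search begins at index `from`, or "no match" if there is none.
    static String firstMatchFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // find(int) resets the matcher and begins the search at `from`
        return m.find(from) ? m.start() + " " + m.group() : "no match";
    }

    public static void main(String[] args) {
        String input = "ab4 56_7ab";
        Matcher m = Pattern.compile("\\d\\w").matcher(input);

        m.find(); // first match: "56", covering indices 4 and 5
        System.out.println(m.start() + " " + m.group() + " end=" + m.end());

        // Searching from index 5 (the '6') would find "6_" ...
        System.out.println(firstMatchFrom("\\d\\w", input, 5));
        // ... but the next plain find() resumes at end() == 6, so it finds "7a".
        System.out.println(firstMatchFrom("\\d\\w", input, m.end()));
    }
}
```

Run as written, the three lines printed should be "4 56 end=6", "5 6_" and "7 7a".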
|
SCJP 5, SCWCD 5
|
 |
Daniel Del Moral
Ranch Hand
Joined: May 24, 2008
Posts: 32
|
|
This is roughly how the regex algorithm works:

Pattern: \d\w
Input: ab4 56_7ab

Let's start at the beginning of the input:

Is 'a' a \d? No, let's move on to the next character.
Is 'b' a \d? No, let's move on to the next character.
Is '4' a \d? Yes, so let's see what comes next.
Is ' ' a \w? No, let's go back.
Is ' ' a \d? No.
Is '5' a \d? Yes.
Is '6' a \w? Yes! Pattern matched: 56. Let's start again after the match.
Is '_' a \d? No.
Is '7' a \d? Yes.
Is 'a' a \w? Yes! Pattern matched: 7a. Let's start again after the match.
Is 'b' a \d? No. String ends. That's it.

[ June 05, 2008: Message edited by: Daniel Del Moral ]
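
The walkthrough above can be sketched as a hand-written loop. This is only an illustration for the fixed pattern \d\w, not how java.util.regex is implemented internally; the class ScanSketch and its methods are hypothetical names, and isWordChar assumes the default (non-Unicode) meaning of \w, i.e. ASCII letters, digits and underscore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScanSketch {
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Assumed meaning of \w without Unicode flags: [a-zA-Z_0-9]
    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || isDigit(c) || c == '_';
    }

    // Hand-written equivalent of calling find() repeatedly for \d\w
    static List<String> scanByHand(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i + 1 < input.length()) {
            if (isDigit(input.charAt(i)) && isWordChar(input.charAt(i + 1))) {
                out.add(i + " " + input.substring(i, i + 2));
                i += 2; // resume right after the matched pair
            } else {
                i++;    // no match starting here, try the next character
            }
        }
        return out;
    }

    // The same scan done with the real regex engine, for comparison
    static List<String> scanWithRegex(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("\\d\\w").matcher(input);
        while (m.find()) {
            out.add(m.start() + " " + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scanByHand("ab4 56_7ab"));    // [4 56, 7 7a]
        System.out.println(scanWithRegex("ab4 56_7ab")); // [4 56, 7 7a]
    }
}
```

Both methods report the match "56" at index 4 and "7a" at index 7, which agrees with the output shown in the original question.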
|
 |
ishmayel vemuru
Ranch Hand
Joined: Jun 13, 2007
Posts: 45
|
|
Hi Daniel Del Moral,

Thanks for your wonderful explanation. Can you look at the example below?

    Pattern p = Pattern.compile(args[0]);
    Matcher m = p.matcher(args[1]);
    boolean b = false;
    while (b = m.find()) {
        System.out.print(m.start() + " ");
    }

    >java Regex "\d\w" "ab4 56_7ab"
    4 7
    >java Regex "\d*" "ab4 56_7ab"
    0 1 2 3 4 6 7 8 9 10

As per your explanation, whenever a digit is found in the input string, find() returns true and control enters the while loop; otherwise find() returns false and the loop is not entered.

Is 'a' a \d? No, let's move on to the next character. But it enters the while loop and returns the index of 'a'. Why?
Is 'b' a \d? No, let's move on to the next character. But it enters the while loop and returns the index of 'b'. Why?
Is '4' a \d? Yes, let's move on to the next character. It enters the while loop and returns the index of '4'. OK.
Is ' ' a \d? No, let's move on to the next character. But it enters the while loop and returns the index of ' '. Why?
Is '5' a \d? Yes, let's move on to the next character. It enters the while loop and returns the index of '5'. OK.
Is '6' a \d? Yes, let's move on to the next character. It enters the while loop and returns the index of '6'. OK.
Is '_' a \d? No, let's move on to the next character. But it enters the while loop and returns the index of '_'. Why?
Is '7' a \d? Yes, let's move on to the next character. It enters the while loop and returns the index of '7'. OK.
Is 'a' a \d? No, let's move on to the next character. But it enters the while loop and returns the index of 'a'. Why?
Is 'b' a \d? No, let's move on to the next character. But it enters the while loop and returns the index of 'b'. Why?

If anyone has an idea, please help me.

Thanks in advance,
Ishmayel.
|
 |
Mustafa Musaji
Ranch Hand
Joined: May 03, 2008
Posts: 52
|
|
|
Because * means zero or more occurrences, \d* can match an empty string, so find() returns true at every position, even where there is no digit.
|
SCJP 5.0 - Passed
|
 |
robert stannard
Ranch Hand
Joined: Jun 02, 2008
Posts: 37
|
|
Hi Ishmayel, I also find this very confusing! But what I think is happening here is that because you're using "*" (a "greedy" quantifier meaning zero or more), \d* is allowed to match zero digits. At a position where there is no digit, the regex engine still reports a match, just an empty one, and that's why the code goes into the while loop for every position. What's interesting about your output is that "5" is missing, and that's because the match at position 4 is actually matching "56", so the next match starts at the position after the "6", which is position 6. I hope this helps.
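
A small sketch that may make the empty matches visible, assuming the same input as above. The class StarDemo and the method describe are hypothetical names; the output format start:end:[group] is just a choice for this illustration. The comparison with \d+ (one or more digits) is added here as a contrast and was not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StarDemo {
    // Lists every match as start:end:[matched text]; an empty match shows as []
    static List<String> describe(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + ":" + m.end() + ":[" + m.group() + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "ab4 56_7ab";
        // \d* : zero or more digits, so empty matches are allowed
        for (String s : describe("\\d*", input)) {
            System.out.println(s);
        }
        System.out.println("---");
        // \d+ : one or more digits, so only real digit runs match
        for (String s : describe("\\d+", input)) {
            System.out.println(s);
        }
    }
}
```

With \d*, the entries whose start equals end are the zero-length matches (positions 0, 1, 3, 6, 8, 9 and 10, the last one being the end of the input), while 2:3:[4], 4:6:[56] and 7:8:[7] are the real digit runs. Index 5 never appears as a start because the match beginning at 4 already consumed it.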
|
SCJP 1.5
|
 |
Swathi Kota
Ranch Hand
Joined: Jun 04, 2008
Posts: 52
|
|
Thanks a lot, Daniel, for your explanation! In the same way, can anyone also explain how greedy and reluctant quantifiers work, with an example? Also, please explain the previous example.
|
 |
Daniel Del Moral
Ranch Hand
Joined: May 24, 2008
Posts: 32
|
|
Hi ishmayel vemuru! I see the misunderstanding; it's all about what the find method returns. The find method performs the regex algorithm that I explained above UNTIL IT FINDS A MATCH IN THE REMAINING STRING. If it finds a match, it returns true; if it doesn't, it returns false. This is the example re-explained:

- The while loop starts.
- Call to m.find():
Is 'a' a \d? No.
Is 'b' a \d? No.
Is '4' a \d? Yes.
Is ' ' a \w? No.
Is ' ' a \d? No.
Is '5' a \d? Yes.
Is '6' a \w? Yes! Pattern matched: 56.
- m.find() returns true, m.start() is 4, m.group() is 56. The while loop continues.
- Call to m.find() again; it continues with the remaining string:
Is '_' a \d? No.
Is '7' a \d? Yes.
Is 'a' a \w? Yes! Pattern matched: 7a.
- m.find() returns true, m.start() is 7, m.group() is 7a. The while loop continues.
- Call to m.find() again:
Is 'b' a \d? No. String ends.
- m.find() returns false, the while loop ends.
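
The sequence of find() calls described above can be logged with a short sketch. The class FindStepByStep and the method trace are made-up names for illustration; the loop is a do/while so that the final, failing call to find() is recorded as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindStepByStep {
    // Records the result of every find() call, including the last one that fails
    static List<String> trace(String regex, String input) {
        List<String> log = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean found;
        do {
            found = m.find();
            if (found) {
                // start() and group() describe the match that find() just located
                log.add("find() -> true, start()=" + m.start() + ", group()=" + m.group());
            } else {
                // after a failed find(), start() and group() must not be called
                log.add("find() -> false");
            }
        } while (found);
        return log;
    }

    public static void main(String[] args) {
        for (String line : trace("\\d\\w", "ab4 56_7ab")) {
            System.out.println(line);
        }
    }
}
```

For the K&B input this logs three calls: two that return true (start 4 with group 56, then start 7 with group 7a) and a third that returns false. Calling start() or group() after the failed call would throw an IllegalStateException, which is why the sketch only reads them when find() returned true.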
|
 |
ishmayel vemuru
Ranch Hand
Joined: Jun 13, 2007
Posts: 45
|
|
Hi, thanks to all. Now I understand the regex examples much better. Once again, thank you all.

Ishmayel.
|
 |