Question about regular expressions

Abdul Rehman
Ranch Hand

Joined: Nov 07, 2006
Posts: 65
Hi to all.
Consider the following code:-



The program is run with the following command:-

java Regex "\d*" ab34ef

It produces the output:-

01234456
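
For readers following along, here is a minimal sketch of a program that fits the command line and output above. It is an assumption, not the listing from the original post: the class name Regex comes from the command line, while the helper method run, the fallback arguments, and the loop body are guesses chosen because they reproduce this output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Regex {
    // Hypothetical helper: collects what the loop would print.
    // For every match, the start index is followed by the matched text.
    static String run(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // int + String is string concatenation, so 2 + "34" gives "234"
            out.append(m.start() + m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Fall back to the thread's arguments when none are given.
        // Inside a Java string literal the backslash must be doubled.
        String regex = args.length > 0 ? args[0] : "\\d*";
        String input = args.length > 1 ? args[1] : "ab34ef";
        System.out.println(run(regex, input));
    }
}
```

Run as java Regex "\d*" ab34ef, this sketch prints 01234456.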

I tried modifying the while loop, making it like this:-



Running with the same command-line, I got the following output:-

New Iteration.
0
New Iteration.
1
New Iteration.
234
New Iteration.
4
New Iteration.
5
New Iteration.
6
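
A loop along these lines would give the output above. This is only a sketch under assumptions: the class name RegexVerbose and the method run are made up for illustration, and the output is collected in a string with "\n" line breaks instead of being printed line by line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexVerbose {
    // Hypothetical helper: one "New Iteration." line per successful find(),
    // followed by the start index concatenated with the matched text.
    static String run(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("New Iteration.\n");
            out.append(m.start() + m.group()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same pattern and input as the command line in this thread.
        System.out.print(run("\\d*", "ab34ef"));
    }
}
```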


Can someone please explain in detail WHY we get these outputs?

Best Regards,
Abdul Rehman.
Abdul Rehman
I figured out the problem myself!

Let me explain it for other beginners (like me). The pattern used in the above example is: \d*
The \d is a pre-defined character class, matching digits, i.e. 0-9. The '*' quantifier used with \d means "any digit, zero or more times."
Moving further, the method find() scans the supplied character sequence for the next subsequence that matches the pattern and returns true if it finds one. Since, in this case, the pattern is "any digit, zero or more times", the empty string is also a valid match, so find() succeeds at every position. Where the character is not a digit, \d* still matches, with zero digits!
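
To see the empty matches directly, one can list the start, end and text of every match that find() reports. The sketch below is only an illustration: the class name EmptyMatchDemo and the method describeMatches are invented for this example, while find(), start(), end() and group() are the standard java.util.regex.Matcher methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Hypothetical helper: describes each match as start-end:"text".
    static List<String> describeMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.start() + "-" + m.end() + ":\"" + m.group() + "\"");
        }
        return result;
    }

    public static void main(String[] args) {
        // \d* can match zero digits, so it matches at every position.
        for (String line : describeMatches("\\d*", "ab34ef")) {
            System.out.println(line);
        }
        // \d+ needs at least one digit, so only "34" is found.
        System.out.println(describeMatches("\\d+", "ab34ef"));
    }
}
```

With \d* there are six matches, five of them empty (start equals end). With \d+ the only match is 2-4:"34".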

Skimming through the while loop, we find that for the first two cases, find() returns true and start() returns the index of the match. But since nothing has been matched (digit, zero times), group() returns "". The '+' operator thus acts as a string concatenation operator to produce "0". In the same way, "1" is produced.
"234" is produced in one cycle, by combining "2" from start() with "34" from group() ["34" exists in the supplied char. seq.]
The outputs "4", "5" and "6" are produced in exactly the same way as "0" and "1". After "34" has been consumed, the next search starts at index 4 ("e"), then moves to index 5 ("f"), and finally to index 6, which is the end of the input, where the empty string still matches.
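
The concatenation step can also be tried in isolation. This is a small sketch, with ConcatDemo and label as made-up names: start() returns an int and group() returns a String, and '+' with a String operand converts the int to text before joining.

```java
class ConcatDemo {
    // Hypothetical helper mirroring m.start() + m.group().
    static String label(int start, String group) {
        // int + String: the int becomes its decimal text, then the two are joined.
        return start + group;
    }

    public static void main(String[] args) {
        System.out.println(label(0, ""));    // prints 0
        System.out.println(label(2, "34"));  // prints 234
        System.out.println(2 + 34);          // prints 36, since int + int is arithmetic
    }
}
```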

Thus the output is fully explained...