| Author |
Question about regular expressions
|
Abdul Rehman
Ranch Hand
Joined: Nov 07, 2006
Posts: 65
|
|
Hi to all. Consider the following code:

The program is run with the following command:

java Regex "\d*" ab34ef

It produces the output:

01234456

I tried modifying the while loop, making it like this:

Running with the same command line, I got the following output:

New Iteration. 0
New Iteration. 1
New Iteration. 234
New Iteration. 4
New Iteration. 5
New Iteration. 6

Can someone please explain in detail why we get these outputs?

Best Regards,
Abdul Rehman.
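The code listings are not preserved in this post. For readers following along, here is a minimal sketch of a program that is consistent with the command line and both outputs quoted above. It is an assumption, not the original code: the method names `firstLoop` and `secondLoop` are hypothetical, and the exact formatting of the modified loop (one line per match) is a guess based on the quoted output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Regex {
    // Assumed shape of the original loop: print start() + group() for each match.
    static String firstLoop(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // int + String is string concatenation, e.g. 2 + "34" gives "234"
            out.append(m.start() + m.group());
        }
        return out.toString();
    }

    // Assumed shape of the modified loop: same values, labelled per iteration.
    static String secondLoop(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("New Iteration. ").append(m.start() + m.group()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Usage: java Regex "\d*" ab34ef
        System.out.println(firstLoop(args[0], args[1]));
        System.out.print(secondLoop(args[0], args[1]));
    }
}
```

With the arguments from the question, `firstLoop` yields `01234456`, matching the output reported above.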
|
 |
Abdul Rehman
Ranch Hand
Joined: Nov 07, 2006
Posts: 65
|
|
I figured out the problem myself! Let me explain it for other beginners (like me).

The pattern used in the above example is: \d*

\d is a predefined character class that matches a digit, i.e. 0-9. The '*' quantifier used with \d means "a digit, zero or more times."

Moving further, the method find() scans the supplied character sequence for the next subsequence that matches the pattern, and returns true if it finds one. Since the pattern here is "a digit, zero or more times", it can match an empty string, so find() succeeds at every position. Where there is a non-digit character, a digit still exists zero times!

Going through the while loop, we find that for the first two iterations find() returns true and start() returns the index of the match (0, then 1). But since nothing has actually been matched (a digit, zero times), group() returns "". The '+' operator thus acts as a string concatenation operator to produce "0". In the same way, "1" is produced.

"234" is produced in one iteration, by combining "2" from start() with "34" from group() (the digits "34" sit at index 2 of the supplied character sequence).

The search then resumes at index 4, right after "34", so "4" and "5" are produced in exactly the same way as "0" and "1". The final "6" is one more empty match, found at the very end of the input (index 6, one past the last character).

Thus the output is fully explained.
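To see the empty matches directly, one can print start(), end() and group() for every match. The sketch below is illustrative only; the class name `MatchTrace` and the helper `trace` are hypothetical and not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchTrace {
    // Returns one row per match in the form "start-end [group]".
    static List<String> trace(String regex, String input) {
        List<String> rows = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // For an empty match, start() == end() and group() is ""
            rows.add(m.start() + "-" + m.end() + " [" + m.group() + "]");
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : trace("\\d*", "ab34ef")) {
            System.out.println(row);
        }
        // prints:
        // 0-0 []
        // 1-1 []
        // 2-4 [34]
        // 4-4 []
        // 5-5 []
        // 6-6 []
    }
}
```

For comparison, calling `trace("\\d+", "ab34ef")` gives only `2-4 [34]`, because '+' requires at least one digit and so never produces an empty match.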
|
 |