Regex: Need clarification on two issues

 
Sandra Bachan
Ranch Hand
Posts: 434
Sierra/Bates Chapter 6, Question 1



Run-time invocation:

java Regex2 "\d*" ab34ef

From what I understand, the pattern is told to look for zero or more digits. Each time find() reports a match, the program prints the match's start index (returned by m.start()) followed by the matched text (returned by m.group()).
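The listing from the book did not survive in this post. The following is a minimal sketch of the kind of program being described, assuming it compiles the first argument as the pattern, runs find() in a loop over the second argument, and prints m.start() followed by m.group() for each match. The class name Regex2Sketch and the helper run() are illustrative names, not the book's.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reconstruction for illustration; not the book's exact listing.
class Regex2Sketch {

    // Concatenates "start index + matched text" for every successful find().
    static String run(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.start()).append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Defaults mirror the invocation: java Regex2 "\d*" ab34ef
        String regex = args.length > 0 ? args[0] : "\\d*";
        String input = args.length > 1 ? args[1] : "ab34ef";
        System.out.println(run(regex, input)); // prints 01234456 for the defaults
    }
}
```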

I trace the program and get the following output:

012345

However, the output of this code is

01234456

Question 1: Is this because the program counts the index past f as 0 or more occurrences of digits?

Question 2: Please clarify if I understood the following correctly

Sierra/Bates explanation of the answer says

The start() method returns the starting position of the previous match because, again, we said find 0 to many occurrences.



It seems that the find() method consumes part of the input, whereas start() returns the starting index of the part that was just consumed. So if 34, which starts at index 2, is consumed, start() returns index 2, which is printed by the System.out.print statement. The next match then starts at index 4 because (again) 34 has been consumed.
 
Shanky Sohar
Ranch Hand
Posts: 1051
// java Regex2 "\d*" ab34ef

First, mark the index of each character in the input:

a b 3 4 e f
0 1 2 3 4 5



Let's see.

The start() method returns the starting index of the current match.

The group() method returns the text that was matched. As you know, "\d*" means zero or more digits, so a match can also be empty.

At indexes 0 and 1 there is no digit, so the match is empty: start() gives 0 and 1, and group() returns an empty string that prints nothing.
At index 2 the matcher finds a digit sequence, so the program prints the start index 2 followed by the group 34.
The matcher then continues from index 4 ("e"), where the match is empty again, so it prints 4 and keeps printing start indexes for the remaining empty matches at 5 and 6 (the position just past the last character).

So the output, with spaces added between the printed pieces, is:
0 1 2 34 4 5 6
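To see each match separately rather than as one concatenated string, here is a small sketch that records start(), end(), and group() for every successful find(). The class name MatchTrace and the method trace() are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper; the names are hypothetical.
class MatchTrace {

    // Returns one line per match, showing where it starts, where it ends, and what it matched.
    static List<String> trace(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            lines.add("start=" + m.start() + " end=" + m.end() + " group=\"" + m.group() + "\"");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : trace("\\d*", "ab34ef")) {
            System.out.println(line);
        }
        // start=0 end=0 group=""
        // start=1 end=1 group=""
        // start=2 end=4 group="34"
        // start=4 end=4 group=""
        // start=5 end=5 group=""
        // start=6 end=6 group=""
    }
}
```

An empty match has start equal to end, which is why only the index shows up in the program's output at those positions.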
 
Shanky Sohar
Ranch Hand
Posts: 1051
Now try to work out what the output would be if I run it like this:


// java Regex2 "\d" ab34ef
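If you want to check your answer afterwards, a sketch along these lines runs both patterns side by side. It assumes the same print logic as the exercise (start index followed by group); the class name DigitVsDigitStar and the helper run() are hypothetical. The expected output is deliberately left out so the exercise is not spoiled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative comparison; names are made up for this example.
class DigitVsDigitStar {

    // Same print logic as the exercise: start index, then the matched text.
    static String run(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.start()).append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "ab34ef";
        // "\d" needs exactly one digit, so an empty match is impossible.
        System.out.println("\\d  -> " + run("\\d", input));
        // "\d*" accepts zero digits, so it also matches where no digit is present.
        System.out.println("\\d* -> " + run("\\d*", input));
    }
}
```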
 
Minhaj Mehmood
Ranch Hand
Posts: 400

Sandra Bachan wrote:Sierra/Bates Chapter 6, Question 1



Run-time invocation:

java Regex2 "\d*" ab34ef

From what I understand, the pattern is told to look for zero or more digits. Each time find() reports a match, the program prints the match's start index (returned by m.start()) followed by the matched text (returned by m.group()).

I trace the program and get the following output:

012345


Can you elaborate on how you arrived at that output?

However, the output of this code is

0123456


Are you sure this is the exact output?
This program should print "01234456".

Question 1: Is this because the program counts the index past f as 0 or more occurrences of digits?

Question 2: Please clarify if I understood the following correctly

Sierra/Bates explanation of the answer says

The start() method returns the starting position of the previous match because, again, we said find 0 to many occurrences.



It seems that the find() method consumes part of the input, whereas start() returns the starting index of the part that was just consumed. So if 34, which starts at index 2, is consumed, start() returns index 2, which is printed by the System.out.print statement. The next match then starts at index 4 because (again) 34 has been consumed.


First of all, let me explain the regex "\d*":
"\d" matches any digit character (0-9).
"*" matches the previous token zero or more times. It is a greedy quantifier, so it matches as many characters as possible.


So what exactly happens when we search for "\d*" in "ab34ef"?
"\d" tries to match a digit (0-9), and "*" allows that to happen zero or more times, so the pattern can also match an empty string. Therefore m.find() returns true at every position, even where there is no digit, and only returns false once the matcher has gone past the end of the input.
Check out the following code; it may give you a clearer view.
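The original listing is not preserved in the thread. As a stand-in, here is a minimal sketch that counts how many times find() succeeds before it finally returns false. The class name FindUntilFalse and the method countMatches() are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in example; not the code originally posted.
class FindUntilFalse {

    // Counts successful find() calls; the loop ends on the first false.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Empty matches at 0, 1, 4, 5, 6 plus "34" at index 2.
        System.out.println(countMatches("\\d*", "ab34ef")); // 6
        // Greedy: "1234" is taken as one match, followed by one empty match at the end.
        System.out.println(countMatches("\\d*", "1234"));   // 2
        // Even an empty input yields one (empty) match at index 0.
        System.out.println(countMatches("\\d*", ""));       // 1
    }
}
```

The second case shows the greedy behaviour: the four digits are consumed as a single match instead of four one-digit matches.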


Hope this helps.

Minhaj

PS: click here to check an awesome online regex builder.
 
Sandra Bachan
Ranch Hand
Posts: 434
@Minhaj: I mistyped the actual output. I have corrected it in my original post.
 
Minhaj Mehmood
Ranch Hand
Posts: 400
So, I hope it's clear now.
 
Sandra Bachan
Ranch Hand
Posts: 434
Yes, it's clear.

I think my confusion came from the way the Java API describes start():

Returns the start index of the previous match.



Now I understand that start() returns the start index of the match found by the most recent call to find().
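A short sketch of that relationship, under the assumption that "previous match" in the API text simply means the match produced by the latest successful find(). The class name StartAfterFind and both helper methods are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative example; the names are hypothetical.
class StartAfterFind {

    // Calls find() n times and returns start() for the n-th match.
    static int startAfterNthFind(String regex, String input, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        for (int i = 0; i < n; i++) {
            if (!m.find()) {
                throw new IllegalArgumentException("fewer than " + n + " matches");
            }
        }
        return m.start();
    }

    // start() has nothing to report until a match has been attempted.
    static boolean startBeforeFindThrows() {
        Matcher m = Pattern.compile("\\d*").matcher("ab34ef");
        try {
            m.start();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(startAfterNthFind("\\d*", "ab34ef", 1)); // 0
        System.out.println(startAfterNthFind("\\d*", "ab34ef", 3)); // 2 (the "34" match)
        System.out.println(startBeforeFindThrows());                // true
    }
}
```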

Thanks!!!
 
Leandro Coutinho
Ranch Hand
Posts: 430
I don't know why it printed 6.
Any help?
 
Henry Wong
author
Posts: 23951
http://faq.javaranch.com/java/ScjpFaq#kb-regexp
 
Leandro Coutinho
Ranch Hand
Posts: 430

Henry Wong wrote: http://faq.javaranch.com/java/ScjpFaq#kb-regexp


Thank you! I still think it is strange. It seems like a bug to me, because this doesn't look like a real "zero-length" match: the index simply doesn't exist (or at least shouldn't).
I wonder who would find it useful for the number 6 to be printed...
Maybe they did it this way because of the null terminator in C strings.
 
Greenhorn
Posts: 15
A parallel scenario is also described and highlighted in the Exam Watch on page 500 of K & B.

Zero-length matches can occur in several places:
a) After the last character of source data....
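One way to look at case (a): a string of length 6 has seven positions, 0 through 6, and position 6 is the gap just after the last character. A hedged sketch that checks this for "ab34ef" follows; the class name EndOfInputMatch and the method lastMatchStart() are made-up names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative example; the names are hypothetical.
class EndOfInputMatch {

    // Returns start() of the last successful find(), or -1 if nothing matched.
    static int lastMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int last = -1;
        while (m.find()) {
            last = m.start();
        }
        return last;
    }

    public static void main(String[] args) {
        String input = "ab34ef";
        // The final empty match of "\d*" sits at input.length(), i.e. 6.
        System.out.println(lastMatchStart("\\d*", input) == input.length()); // true
        // Position 6 is a legal position in the string: substring(6) is empty, not an error.
        System.out.println(input.substring(6).isEmpty());                     // true
        // Without the *, the last match is the digit at index 3.
        System.out.println(lastMatchStart("\\d", input));                     // 3
    }
}
```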

 