
Matcher.start() on pattern \d* returns an index that seems off

 
Rick Reumann
Ranch Hand
Posts: 281
I'm a bit confused about why Matcher's start method returns an index that I would expect to be out of bounds for the following search text...



Result:
0
1
2
3 34
5
6
7
8   <-- Why this index?

The spec for the "start" method of Matcher (http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Matcher.html) says: "Returns the start index of the previous match."

To me this doesn't make sense. The previous match, the char "f", has a starting index of 7. I understand it spans positions 7 to 8, but the docs say it returns the "start index" (not the ending index) of the previous match. Also, if it were supposed to return the ending index, I would expect the first thing printed to be a 1, not a 0. I'm sure I'm just missing something simple here.

Thanks for any help.
 
Henry Wong
author
Marshal
Posts: 20997
Your regular expression can match an empty string -- in fact, most of the matches are empty matches.

Index 8 is the empty match, at the end of your string.
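
To make the empty matches visible, here is a minimal sketch that prints both start() and end() for every match. The original snippet did not survive in this thread, so the input "abc34def" and the class and method names are assumptions, picked only because they are consistent with the results posted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Returns one "start-end [text]" entry per match found by Matcher.find().
    static List<String> describeMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + " [" + m.group() + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        // "abc34def" is an assumed input, not the original poster's string.
        for (String line : describeMatches("\\d*", "abc34def")) {
            System.out.println(line);
        }
        // Prints:
        // 0-0 []
        // 1-1 []
        // 2-2 []
        // 3-5 [34]
        // 5-5 []
        // 6-6 []
        // 7-7 []
        // 8-8 []
    }
}
```

Every entry whose start equals its end is an empty match. The last one, 8-8, sits just past the final character: index 8 equals the length of the string, which is a valid position for a zero-length match.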

Henry
 
Matt Russell
Ranch Hand
Posts: 165
This link has a good explanation for a similar Regexp question: http://faq.javaranch.com/view?ScjpFaq#kb-regexp
 
athakur athakur
Greenhorn
Posts: 8
I understand the explanation you guys gave above, but I have one doubt.

If I modify the code to something like this:

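import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    public static void main(String[] args) {
        // Reluctant quantifier: \d*? prefers the shortest possible match
        Pattern p = Pattern.compile("\\d*?");
        Matcher m = p.matcher("ab34ef");
        while (m.find()) {
            // Print the start index followed by the matched text
            System.out.print(m.start() + m.group());
        }
    }
}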

After changing the pattern from greedy to reluctant, I get the output: 0123456

Can anyone please explain this to me? Even though it is reluctant, it should at least print 3 and 4, with index 2 and 3 respectively.

Thanks
 
Henry Wong
author
Marshal
Posts: 20997
Can anyone please explain this to me? Even though it is reluctant, it should at least print 3 and 4, with index 2 and 3 respectively.


Reluctant means that it should match the smallest match possible -- and in this case, the smallest possible is an empty match.
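
As a rough illustration, the sketch below runs three quantifiers over the same input "ab34ef" and prints start index plus matched text, the same way the code above does. The class and method names are made up for this example, and the \d+? pattern is not from the thread: it is added only because "one or more" cannot match empty, so it yields the per-digit output that was expected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Concatenates start index and matched text for every match, like the posted code.
    static String run(String regex, String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            sb.append(m.start()).append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "ab34ef";
        // Greedy: takes as many digits as it can, empty everywhere else.
        System.out.println(run("\\d*", input));   // prints 01234456
        // Reluctant: takes as few digits as it can, which is always zero.
        System.out.println(run("\\d*?", input));  // prints 0123456
        // Reluctant but at least one digit: matches "3" at 2 and "4" at 3.
        System.out.println(run("\\d+?", input));  // prints 2334
    }
}
```

With \d*? the engine never has a reason to consume a digit, because the empty match already succeeds at every position, including index 6 at the end of the string.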

Henry
 