What's the Purpose of the Boundary Matchers Caret and Dollar Sign?

Kevin Simonson
Ranch Hand

Joined: Oct 22, 2011
Posts: 114
I wanted to do some pattern detection using regular expressions, so I went to "http://docs.oracle.com/javase/7/docs/api" and looked up class {String} and scrolled down to the section for method {matches()}, where the documentation said, "Tells whether or not this string matches the given regular expression." I clicked on "regular expression", and that took me to the page for class {Pattern}, down to the section labeled <Summary of regular-expression constructs>. I scrolled even further down to the subsection labeled <Boundary matchers>. The caret was listed as a boundary matcher for "The beginning of a line", and the dollar sign was listed as a boundary matcher for "The end of a line". So I wrote the following code:

I expected the call to {actual.matches( expectedOne)} to return {true}, because the pattern had no dollar sign at the end, so it wasn't anchored at the end. But it returned {false}. The call to {actual.matches( expectedTwo)} did return {true}. What's the use of the two anchors, caret and dollar sign, if a regular expression needs a ".*" at the end whenever you want to match just a prefix of the line? I mean, I could have written the code:

without the carets, and it would have had the same effect. So why even have the two anchors?

Kevin S
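
The listings from the question are not shown here. A minimal sketch of the behaviour it describes follows; the string values and the class name are hypothetical, and only the names actual, expectedOne and expectedTwo come from the post:

```java
class MatchesWholeString {
    // Hypothetical input; the original post's string is not shown.
    static final String ACTUAL = "The quick brown fox jumps over the lazy dog.";

    // String.matches succeeds only if the regex consumes the entire string.
    static boolean matchesWhole(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String expectedOne = "^The quick brown fox";     // anchored at the start only
        String expectedTwo = "^The quick brown fox.*";   // ".*" consumes the rest
        String noCaret = "The quick brown fox.*";        // same pattern without the caret

        System.out.println(matchesWhole(ACTUAL, expectedOne)); // false
        System.out.println(matchesWhole(ACTUAL, expectedTwo)); // true
        System.out.println(matchesWhole(ACTUAL, noCaret));     // true
    }
}
```

With matches() the caret makes no visible difference, which is exactly what prompted the question.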
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42374
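String.matches tries to match the entire string, so the pattern behaves as if it had an implicit "^" at the beginning and "$" at the end. That's the difference between Matcher.matches, which must match the whole input, and Matcher.find, which looks for a match anywhere in it.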

The caret is important if you want to find "The quick brown fox" only once in "The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog."; otherwise you'll find it twice.
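
A minimal sketch of that counting, using Matcher.find in a loop. The class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaretFindDemo {
    // Counts how many times find() succeeds for the given regex on the input.
    static int countFinds(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog. "
                    + "The quick brown fox jumps over the lazy dog.";

        System.out.println(countFinds("The quick brown fox", text));  // 2
        System.out.println(countFinds("^The quick brown fox", text)); // 1
    }
}
```

Without MULTILINE mode the caret can only match at the very start of the input, so the second occurrence is skipped.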


R. Jain
Ranch Hand

Joined: Aug 11, 2012
Posts: 375

It's not mentioned in the docs explicitly, but the String#matches(regex) method always matches from the beginning to the end of the string. That means the anchors ^ and $ are implicit.
If you try the same example with Matcher#find() method, you'll understand the use of anchors. You will get the expected output with that method.
Try out this simple example:


You'll get the following output:
Pattern
Not Matches

Now try changing the regex from "^Pattern" to "^matcher" and see what output you get. I guess that should make it clear. You can construct more such examples to understand it better.
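
A sketch that produces the output quoted above might look like the following. The input string "Pattern matcher", the class and method names, and the "Not Found" label are assumptions, not the original listing:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVersusMatches {
    // First line: what find() located (or "Not Found").
    // Second line: whether String.matches accepts the whole input.
    static String describe(String regex, String input) {
        Matcher finder = Pattern.compile(regex).matcher(input);
        String found = finder.find() ? finder.group() : "Not Found";
        String whole = input.matches(regex) ? "Matches" : "Not Matches";
        return found + "\n" + whole;
    }

    public static void main(String[] args) {
        String input = "Pattern matcher"; // assumed input

        // find() locates "Pattern" at the start; matches() fails on the rest.
        System.out.println(describe("^Pattern", input));

        // "matcher" is not at the start of the input, so the caret
        // stops find() from locating it, and matches() fails as well.
        System.out.println(describe("^matcher", input));
    }
}
```

Dropping the caret from the second regex lets find() succeed, which shows what the anchor contributes.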

To learn more about regex, see the Regular-Expressions Tutorial.
Stephan van Hulst
Bartender

Joined: Sep 20, 2010
Posts: 3647

Why can you put the character 'a' in a regular expression? Because sometimes you want to match a String containing the letter 'a'. And sometimes you want to match a String containing an end or a beginning of a line.

Let's say you want to read every character at the beginning of a line. Here's how you could do it:
Note that I haven't tested this, and you also need to switch
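
One way this could be sketched is shown below. It is not the code from the post, and the class and method names are made up. By default, ^ in Java matches only at the start of the whole input, so the sketch compiles the pattern with Pattern.MULTILINE to make it match at the start of every line:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineStartDemo {
    // Collects the first character of each non-empty line.
    static List<String> firstCharacters(String text) {
        // MULTILINE lets ^ match after every line terminator,
        // not just at the start of the input.
        Matcher m = Pattern.compile("^.", Pattern.MULTILINE).matcher(text);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(firstCharacters("alpha\nbeta\ngamma")); // [a, b, g]
    }
}
```

Without the MULTILINE flag the same loop would find only the "a".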
 