

JavaRanch » Java Forums » Java » Java in General

regexp

Max Rahder
Ranch Hand

Joined: Nov 06, 2000
Posts: 177
My goal is to come up with a regex pattern that allows any string beginning with "B-" to match. I.e., the strings "B-17" and "B-hithere" should both match.

I'm in an environment where I must use the Jakarta regexp library. Regexp can be found at the Jakarta regexp home page.

I can't figure out how to code the regex pattern.

Here's a simple example. I thought this would be a literal pattern. I.e., the only thing that should match the pattern "B-aa" should be the string "B-aa" itself; but other things match.



Why are so many strings matching? (How do I get it to treat "B-" as a literal?)

Thanks for helping!
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

I'm not experienced with regex (much less with Jakarta), but I'm guessing it would be...

"[Bb]-.*"

That is, match Strings starting with one uppercase or lowercase "B", followed by a hyphen, followed by zero or more (denoted by the asterisk) of any character (denoted by the period).
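
For illustration, here is a minimal sketch of that pattern using the standard java.util.regex package instead of Jakarta regexp. The class name PrefixPatternSketch and the helper startsWithBDash are made up for this example. It relies on Pattern.matches, which requires the entire input to match, so here the pattern acts as a prefix test. A call that searches for the pattern anywhere in the input can give different results.

```java
import java.util.regex.Pattern;

class PrefixPatternSketch {
    // Hypothetical helper: true when the whole input fits "[Bb]-.*",
    // i.e. one "B" or "b", then a hyphen, then zero or more characters.
    static boolean startsWithBDash(String input) {
        // Pattern.matches succeeds only if the entire input matches.
        return Pattern.matches("[Bb]-.*", input);
    }

    public static void main(String[] args) {
        String[] samples = {"B-17", "b-hithere", "Baa", "rumplestilskinB-aa"};
        for (String s : samples) {
            System.out.println(s + " -> " + startsWithBDash(s));
        }
    }
}
```

With whole-input matching, "B-17" and "b-hithere" pass, while "Baa" and "rumplestilskinB-aa" do not.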


See http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
[ June 20, 2005: Message edited by: marc weber ]

Max Rahder
Ranch Hand

Joined: Nov 06, 2000
Posts: 177
Nope. Here are the results of running your guess:



Having "rumplestilskinB-aa" be a match is still not the behavior I want.

I have made lots of guesses myself, but none has explained what appears to be a fundamental misunderstanding on my part. (I suspect the hyphen is the problem. In some contexts a hyphen is a range operator. I'm afraid my pattern somehow is matching anything in the range "B" through "aa". The problem with that theory is that if it's true the "-" shouldn't be required in the matching string, so "Baa" should test "true" also, but it doesn't. But I still suspect the hyphen.)

I need the help of someone who knows either regex in general or Jakarta regexp in particular.
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

The difference is in the behavior of the RE.match method: judging by your results, it looks for the pattern anywhere in the argument String, not only at the start. If you want the match to begin at the beginning of the argument String, then use the boundary matcher ^...

"^[Bb]-.*"

Or if you want a simple literal without case variation, just use...

"^B-.*"

I've tested this using the org.apache.regexp package and it works as expected on your examples...
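
As a separate illustration (not the output of the org.apache.regexp test mentioned above), the effect of the ^ anchor can be sketched with java.util.regex. Its Matcher.find() searches for the pattern anywhere in the input, which is the behavior this thread observes for RE.match. Treat this as an analogue under that assumption. AnchorSketch and found are hypothetical names.

```java
import java.util.regex.Pattern;

class AnchorSketch {
    // Hypothetical helper: true if the regex is found anywhere in the input.
    // Assumed Jakarta counterpart: new org.apache.regexp.RE(regex).match(input)
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"B-17", "B-hithere", "Baa", "rumplestilskinB-aa"};
        for (String s : samples) {
            // Without ^ the search may succeed in the middle of the input;
            // with ^ the match has to begin at the first character.
            System.out.println(s
                    + "  unanchored=" + found("B-.*", s)
                    + "  anchored=" + found("^B-.*", s));
        }
    }
}
```

Only the anchored pattern rejects "rumplestilskinB-aa". Both forms reject "Baa", because the hyphen is an ordinary literal outside a character class such as [a-z].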

[ June 20, 2005: Message edited by: marc weber ]
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

Actually, when I said "I'm not experienced with regex," I should have said "I'm not (very) experienced with regex in Java."

But I have been using a form of regex for years with a product called Hyper.Ink, which is an application that converts rich text source material into Lotus Notes documents. Basically, we use regexes to define "link rules" so that Hyper.Ink can convert specific text patterns into hyperlinks among the databases.
 