JavaRanch » Java Forums » Java » Java in General

Need help with regex expression

Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 145

I'm trying to write some code that scans an HTML file and returns links to reviews of a game.
The file contains long URLs in double quotes (""). The specific URLs I am looking for will contain
the game name and the word 'review'.

This is what I have so far:

Pattern p = Pattern.compile("http://.*?(?=.*gamename\\sreview).*?(?=\")");

I'm struggling somewhat! Can anyone help?
Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 145

I figured it out:

String MyRegex= "http://www[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]" + "gamename" + "[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
Richard Tookey

Joined: Aug 27, 2012
Posts: 1166

Now, I'm a great fan of regular expressions, but that is just dreadful and provides ammunition for the guys round here who preach that regexes were invented by the Devil.

In your OP you specified that the target would contain the word 'review', but I don't see it in your regex. Also, I assume the game name is supplied as a variable, so it needs to be escaped so that none of it is interpreted as regex metacharacters.
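
One way to do the escaping is `java.util.regex.Pattern.quote`, which makes the whole name match literally. The sketch below is only an illustration: the class name `ReviewLinkFinder`, the method `findReviewLinks`, the example.com URLs, and the assumption that 'review' appears somewhere after the game name inside the same quoted URL are all hypothetical and not taken from the original posts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReviewLinkFinder {

    // Hypothetical helper: returns every double-quoted http URL in html that
    // contains gameName (taken literally) followed later by the word "review".
    public static List<String> findReviewLinks(String html, String gameName) {
        // Pattern.quote wraps the name in \Q...\E, so characters such as
        // '.', '+' or '(' lose their special regex meaning.
        String literalName = Pattern.quote(gameName);

        // [^"]* cannot cross a double quote, so each match stays inside
        // one quoted value. Assumption: "review" comes after the game name.
        Pattern p = Pattern.compile(
                "\"(http://[^\"]*" + literalName + "[^\"]*review[^\"]*)\"",
                Pattern.CASE_INSENSITIVE);

        List<String> links = new ArrayList<>();
        Matcher m = p.matcher(html);
        while (m.find()) {
            links.add(m.group(1)); // group 1 is the URL without the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.example.com/half-life+2/review.html\">x</a>"
                + "<a href=\"http://www.example.com/half-lifeX2/review.html\">y</a>";
        System.out.println(findReviewLinks(html, "half-life+2"));
    }
}
```

Without `Pattern.quote`, the `+` in a name like "half-life+2" would be read as a quantifier on the preceding `e`, and the pattern would no longer match the literal name.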
Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 145

Yeah, I decided to ditch 'review'. When you say 'escape' the game name, do you mean just adding a forward slash in front of it? Could you show me what you mean?
Jeanne Boyarsky
author & internet detective

Joined: May 26, 2003
Posts: 33132

Consider breaking it up to make it more readable (and have less duplication). For example, a first iteration of refactoring could be:

As a second iteration, you could use character classes such as digit or word character, or extract the common parts into another String. The idea is for the final regexp to have less to read.
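
A minimal sketch of what such a refactoring might look like, which may differ from the code originally posted here. It reuses the two character classes from the regex earlier in the thread; the class name `ReviewUrlRegex`, the constant names, and the method `buildRegex` are hypothetical, and the `Pattern.quote` call is an addition that follows the escaping advice above.

```java
import java.util.regex.Pattern;

public class ReviewUrlRegex {

    // Zero or more characters allowed inside the URL
    // (character class copied from the regex posted earlier in the thread).
    private static final String URL_CHARS = "[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*";

    // One character allowed at the end of each part
    // (same class without '?', '!', ':', ',', '.' and ';').
    private static final String URL_END = "[-a-zA-Z0-9+&@#/%=~_|]";

    // The piece that appeared twice in the original regex.
    private static final String URL_PART = URL_CHARS + URL_END;

    // Hypothetical helper: builds the regex for one game name.
    public static String buildRegex(String gameName) {
        // Pattern.quote makes the name match literally (an addition,
        // not part of the regex as originally posted).
        return "http://www" + URL_PART + Pattern.quote(gameName) + URL_PART;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(buildRegex("gamename"));
        System.out.println(p.matcher("http://www.site.com/gamename/review").matches());
    }
}
```

For the second iteration, `\\w` in a Java string is the word-character class, equivalent to `[a-zA-Z0-9_]` by default, so it could stand in for part of each bracket expression.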

Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 145

That's awesome, thanks!