Need help with regex expression

Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 114

I'm trying to write some code that scans an HTML file and returns links to reviews of a game.
The file contains long URLs in double quotes. The specific URLs I am looking for will contain
the game name and the word 'review'.

This is what I have so far:

Pattern p = Pattern.compile("http://.*?(?=.*gamename\\sreview).*?(?=\")");

I'm struggling somewhat! Can anyone help?
Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 114

I figured it out:

String MyRegex= "http://www[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]" + "gamename" + "[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
Richard Tookey
Ranch Hand

Joined: Aug 27, 2012
Posts: 1035
    

Now I'm a great fan of regular expressions but that is just dreadful and provides ammunition for the guys round here who preach that regex were invented by the Devil.

In your OP you specified that the target would contain the word 'review', but I don't see it in your regex. Also, I assume the game name is supplied as a variable, so it needs to be escaped so that none of it is interpreted as regex metacharacters.
Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 114

Yeah, I decided to ditch 'review'. When you say 'escape' the game name, do you mean just adding a forward slash in front of it? Could you show me what you mean?
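
For reference, escaping here does not mean adding a slash by hand. One common way is `Pattern.quote`, which makes the whole name match literally. The following is a minimal sketch, not code from the thread: the class name `GameLinkFinder`, the method `findLinks`, the sample HTML, and the simplified URL character set are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GameLinkFinder {
    // Characters allowed inside a URL, taken from the regex posted above
    // (simplified to a single character class for this sketch).
    private static final String URL_CHARS = "[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*";

    /** Returns every http://www... URL in html that contains gameName literally. */
    public static List<String> findLinks(String html, String gameName) {
        // Pattern.quote wraps the name in \Q...\E so that characters such as
        // '.', '+' or '(' are matched literally instead of as regex operators.
        Pattern p = Pattern.compile(
                "http://www" + URL_CHARS + Pattern.quote(gameName) + URL_CHARS);
        Matcher m = p.matcher(html);
        List<String> links = new ArrayList<>();
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical input: the game name contains '+', a regex metacharacter.
        String html = "<a href=\"http://www.example.com/c++quest/review\">x</a>"
                    + "<a href=\"http://www.example.com/other\">y</a>";
        System.out.println(findLinks(html, "c++quest"));
    }
}
```

Without the `Pattern.quote` call, a name like `c++quest` would be read as a quantified `c` followed by `quest`, so the link in the sample would not be found.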
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30068
    

Consider breaking it up to make it more readable (and to reduce duplication). For example, a first iteration of refactoring could be:
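
A minimal sketch of what such a first iteration might look like, assuming the regex posted earlier in the thread. The class and constant names are hypothetical and `gamename` is the same placeholder used in the original post; the resulting string is identical to the original one, only assembled from named parts.

```java
import java.util.regex.Pattern;

public class ReviewRegex {
    // Hypothetical names; the character sets are copied from the regex posted earlier.
    private static final String URL_BODY = "[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*";
    private static final String URL_END  = "[-a-zA-Z0-9+&@#/%=~_|]";
    // The body-plus-end sequence appeared twice in the original, so it is named once here.
    private static final String URL_PART = URL_BODY + URL_END;

    // "gamename" stands in for the real title, as in the original post.
    public static final String MY_REGEX = "http://www" + URL_PART + "gamename" + URL_PART;

    public static void main(String[] args) {
        // Hypothetical URL used only to exercise the pattern.
        String url = "http://www.example.com/gamename/review";
        System.out.println(Pattern.matches(MY_REGEX, url));
    }
}
```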


As a second iteration, you could use predefined character classes such as digit (\d) or word character (\w), or extract the common parts into another String. The idea is for the final regex to have less to read.
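
One way the second iteration could go, as a sketch under the same assumptions (hypothetical class and constant names, `gamename` as a placeholder). It relies on the fact that in Java `\w` is by default equivalent to `[a-zA-Z0-9_]`, so it can replace those ranges and the underscore inside the existing character sets.

```java
import java.util.regex.Pattern;

public class ReviewRegexV2 {
    // \w is shorthand for [a-zA-Z0-9_], so it replaces those ranges and the underscore.
    private static final String URL_BODY = "[-\\w+&@#/%?=~|!:,.;]*";
    private static final String URL_END  = "[-\\w+&@#/%=~|]";
    private static final String URL_PART = URL_BODY + URL_END;

    // Compiled once so the pattern can be reused for every file scanned.
    public static final Pattern MY_PATTERN =
            Pattern.compile("http://www" + URL_PART + "gamename" + URL_PART);

    public static void main(String[] args) {
        // Hypothetical URLs used only to exercise the pattern.
        System.out.println(MY_PATTERN.matcher("http://www.example.com/gamename_2/review").matches());
        System.out.println(MY_PATTERN.matcher("http://www.example.com/other/review").matches());
    }
}
```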


Billy Sclater
Ranch Hand

Joined: Nov 18, 2012
Posts: 114

That's awesome, thanks!
 