Regex pattern

Jacob Sonia
Ranch Hand

Joined: Jun 28, 2009
Posts: 172
Hi,

I have these example URLs:
http://twitter.com/*
http://twitter.com/*/rs

Here * can be anything, such as user_name or user.name.

I could only come up with one pattern for extracting it, but it also returns the / when one is present. Please help me with a more correct one.

This is my Java program:


Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19545
    

Let's break down your regex:
- (?<=http[s]?://twitter.com/) - a positive lookbehind for http://twitter.com/ and https://twitter.com/. Looks fine to me
- ($|(.*)/|(.*)|\\?=)
--- $ - end of string
--- (.*)/ - anything followed by /
--- (.*) - anything
--- \\?= - a ? followed by =

Both (.*) and (.*)/ allow a / inside your match, because . matches any character, including /.
An easy fix: change both occurrences of .* into [^/]*, in other words anything but a /. The (.*)/ alternative then becomes ([^/]*)/, which still puts the trailing / in the match, so remove that alternative. What remains: "(?<=http[s]?://twitter.com/)($|([^/]*)|\\?=)"

By the way, your while loop is effectively an if statement because of the break, so just change it into one.
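
To see the corrected pattern in action, here is a minimal sketch. The class name TwitterUserExtractor, the method extractUser and the sample URLs are made up for illustration; the original program isn't shown in this thread, so this is only one way it could look.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwitterUserExtractor {
    // The corrected pattern from above: a lookbehind for the prefix, then
    // either end of string, a run of non-slash characters, or "?=".
    private static final Pattern USER =
            Pattern.compile("(?<=http[s]?://twitter.com/)($|([^/]*)|\\?=)");

    // Returns the text right after the prefix, or null if the prefix is absent.
    static String extractUser(String url) {
        Matcher m = USER.matcher(url);
        // Only the first match is needed, so an if replaces the while/break.
        if (m.find()) {
            return m.group();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractUser("http://twitter.com/user_name"));     // user_name
        System.out.println(extractUser("https://twitter.com/user.name/rs")); // user.name
    }
}
```

Note that the . in twitter.com is unescaped in the pattern, so it matches any character there; write twitter\\.com if that matters to you.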

Jacob Sonia
Ranch Hand

Joined: Jun 28, 2009
Posts: 172
Hey, thanks a lot for the reply, it really helped me. Could you suggest a book for understanding the basics of regex patterns?

I also have this problem: I want to accept everything that starts with http://abc.com, except URLs that start with http://abc.com/xyz. I tried this, but I don't think it's right; it doesn't match the last one.



Raymond Tong
Ranch Hand

Joined: Aug 15, 2010
Posts: 230
    

Here are some URLs about regular expressions:
http://www.regular-expressions.info/
http://download.oracle.com/javase/tutorial/essential/regex/


This will fail for


You don't have to escape "/" as "\\/"; a plain "/" is fine.
If the sub-domain (www) is optional, you may want to use "?".
You may want to have a slash "/" after your (ae|com).

It may be easier to write down the pattern with pen and paper
before turning it into a regular expression.
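
As a rough sketch of those three tips put together (the pattern itself, the class name AbcUrlPattern and the sample URLs are assumptions for illustration, since the regex being discussed isn't shown here):

```java
import java.util.regex.Pattern;

class AbcUrlPattern {
    // Assumed pattern: "/" left unescaped, sub-domain made optional with "?",
    // and a "/" required before any path that follows (ae|com).
    static final Pattern URL =
            Pattern.compile("^http://([\\w-]+\\.)?abc\\.(ae|com)(/.*)?$");

    static boolean accepts(String url) {
        return URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("http://www.abc.com"));   // true
        System.out.println(accepts("http://abc.ae/path"));   // true
        System.out.println(accepts("http://abc.community")); // false
    }
}
```

Requiring the slash is what keeps a host such as abc.community from slipping through.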
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19545
    

Jacob Sonia wrote:Also i have this problem - Here i want everything after http://abc.com* except http://abc.com/xyz* - means all would be accepted which starts with http://abc.com but the one which starts with http://abc.com/xyz will not be accepted.

Check out java.util.regex.Pattern for negative lookahead. What you basically need:
- http://abc.com
- a negative lookahead for /xyz
- anything else
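
Those three parts could be put together roughly like this. It is only a sketch: the class name NotXyzPattern, the method isAllowed and the sample URLs are made up, and it uses the literal host abc.com from the question with no sub-domain handling.

```java
import java.util.regex.Pattern;

class NotXyzPattern {
    // 1. the literal prefix, 2. a negative lookahead for /xyz, 3. anything else
    static final Pattern ALLOWED =
            Pattern.compile("^http://abc\\.com(?!/xyz).*");

    static boolean isAllowed(String url) {
        return ALLOWED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("http://abc.com"));          // true
        System.out.println(isAllowed("http://abc.com/foo"));      // true
        System.out.println(isAllowed("http://abc.com/xyz/page")); // false
    }
}
```

The lookahead consumes nothing; it only vetoes the match when /xyz comes next, and the trailing .* then takes whatever follows.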
Jacob Sonia
Ranch Hand

Joined: Jun 28, 2009
Posts: 172
Hi, I tried this after looking at java.util.regex.Pattern:

String regex ="^http:\\/\\/[\\w-]+\\.abc\\.(com)($|[.* && ?![xyz]*])" ;

It doesn't work either.
Raymond Tong
Ranch Hand

Joined: Aug 15, 2010
Posts: 230
    

Jacob Sonia wrote:Hi, I tried this after looking at java.util.pattern

String regex ="^http:\\/\\/[\\w-]+\\.abc\\.(com)($|[.* && ?![xyz]*])" ;

Doesn't work either

Here is a more detailed description of lookaround in regular expressions:
http://www.regular-expressions.info/lookaround.html
Jacob Sonia
Ranch Hand

Joined: Jun 28, 2009
Posts: 172
Another try:

String regex ="^http:\\/\\/[\\w-]+\\.abc\\.(ae|com)($|(?!(/xyz).*).*)" ;
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19545
    

You should always check the Javadocs of java.util.regex.Pattern for the syntax. I see you're using a !, but that's not supported in Java. I already told you how to do this, using the negative lookahead.
Jacob Sonia
Ranch Hand

Joined: Jun 28, 2009
Posts: 172
Hi,
But what I created is supported. Why do you think that ! is not supported? For me the pattern works as expected.
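
For anyone following along: (?!X) is the negative lookahead construct listed in the java.util.regex.Pattern Javadoc, so the last attempt above does compile. Here is a quick sketch to check how it behaves; the class name LastAttemptCheck and the sample URLs are made up for illustration.

```java
import java.util.regex.Pattern;

class LastAttemptCheck {
    // The pattern from the last attempt above, copied unchanged.
    static final Pattern ATTEMPT = Pattern.compile(
            "^http:\\/\\/[\\w-]+\\.abc\\.(ae|com)($|(?!(/xyz).*).*)");

    static boolean matches(String url) {
        return ATTEMPT.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("http://www.abc.com"));       // true
        System.out.println(matches("http://www.abc.ae/foo"));    // true
        System.out.println(matches("http://www.abc.com/xyz/1")); // false
        System.out.println(matches("http://abc.com"));           // false: a sub-domain is required
    }
}
```

The $| alternative and the .* inside the lookahead are redundant, so the tail could probably be shortened to (?!/xyz).* without changing these results.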
 