This week's book giveaway is in the OO, Patterns, UML and Refactoring forum. We're giving away four copies of Refactoring for Software Design Smells: Managing Technical Debt and have Girish Suryanarayana, Ganesh Samarthyam & Tushar Sharma on-line! See this thread for details.
Let's break down your regex:
- (?<=http[s]?://twitter.com/) - a positive lookbehind for http://twitter.com/ and https://twitter.com/. Looks fine to me
--- $ - end of string --- (.*)/ - anything followed by /
--- (.*) - anything
--- \\?= - a ? followed by =
You clearly specify that you want / inside your match, both in (.*) and in (.*)/
An easy fix: change both occurrences of .* into [^/]*. In other words, anything but a /. That still means you match anything but a / followed by a /, so remove that part. What remains: "(?<=http[s]?://twitter.com/)($|([^/]*)|\\?=)"
By the way, your while loop is actually an if-loop because of the break. So just change it into one.
Hey thanks a lot for the reply, it really helped me. Please guide me what book should i read for understanding the basics of regex pattern.
Also i have this problem - Here i want everything after http://abc.com* except http://abc.com/xyz* - means all would be accepted which starts with http://abc.com but the one which starts with http://abc.com/xyz will not be accepted. I tried this, but i think this is not that great, there is some problem to it,it doesn't match the last one.
You should always check the Javadocs of java.util.regex.Pattern for the syntax. I see you're using a !, but that's not supported in Java. I already told you how to do this, using the negative lookahead.
Joined: Jun 28, 2009
But whatever I created is supported. Why do you think that ! Is not supported. For me the pattern works as expected.