Trouble with regex

 
Raj S Kumar
Ranch Hand
Posts: 48
I have to identify a set of characters in a string, a filepath.

D:\Development\Devlocale\SVN\ABC\ABCD\EN\Hello_EN.rc

I have to pick out the characters 'EN' where they are preceded by \ or _ or . and followed by \ or _ or .

The problem is that I don't know where to start. I have read the Javadoc for the Pattern class but couldn't get going.

Could you please show me how to find just 'EN' anywhere in the string? I could build on that by adding the conditions.

Thanks in advance.
 
Raj S Kumar
Ranch Hand
Posts: 48
I have got an answer for the basic part.
I will post again if I run into any further issues.
 
Peter Taucher
Ranch Hand
Posts: 174
It would be polite to post a resolution here as well. Other people may benefit from it in the future.

Regex Tutorial -> http://java.sun.com/docs/books/tutorial/essential/regex/

As a starting point for your pattern this might help (I'm no regex pro, so there may be a more efficient or prettier way to do it):
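A minimal sketch of such a starting point, assuming plain java.util.regex and nothing else: it only finds the literal EN anywhere in the string, with no conditions on the neighbouring characters yet. The class and method names (FindEn, findStarts) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindEn {
    // Returns the start index of every occurrence of the literal text in the input.
    static List<Integer> findStarts(String input, String literal) {
        // Pattern.quote makes sure the text is matched literally, not as a regex.
        Pattern pattern = Pattern.compile(Pattern.quote(literal));
        Matcher matcher = pattern.matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Backslashes must be doubled inside a Java string literal.
        String path = "D:\\Development\\Devlocale\\SVN\\ABC\\ABCD\\EN\\Hello_EN.rc";
        for (int start : findStarts(path, "EN")) {
            System.out.println("Found EN at index " + start);
        }
    }
}
```

The conditions on the characters before and after EN can then be added to the pattern once the basic match works.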
 
Raj S Kumar
Ranch Hand
Posts: 48
Sure, I will post the resolution once it is done.
I just got an answer for the basic thing. A lot more is yet to be done.
 
Raj S Kumar
Ranch Hand
Posts: 48
Hi,
I have to replace the 'EN' in file paths with other language codes, such as 'DE'. The condition is that 'EN' is preceded and followed by '\', '_' or '.'
After the replacement, the new language code should keep the same surrounding characters.

D:\Development\Devlocale\SVN\ABC\ABCD\EN\Hello_EN.rc

I have done it with the following code (it covers all the conditions). Is there a more efficient way of doing it?


Please note that the language codes (EN, DE) are return values from methods. I have substituted hardcoded values here.
 
Rob Spoor
Sheriff
Posts: 20532
Use a positive lookahead and lookbehind; in other words, only look for EN and put the rest in a lookbehind / lookahead.

For example, written as a Java string literal: "(?<=[\\\\_.])EN(?=[\\\\_.])"
That regex may look odd, but it's quite easy:
- (?<=[\\\\_.]) is a positive lookbehind that matches backslash, underscore or .
- EN is the literal EN
- (?=[\\\\_.]) is a positive lookahead that matches backslash, underscore or .

The secret with lookbehind / lookahead is that its presence (or, with negative lookahead / lookbehind, its absence) is required but it will not be part of the match. My regex will match only EN, and only if it is preceded and followed by one of the characters you've specified.
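A minimal sketch of how the lookbehind / lookahead described above could be used for the replacement, assuming the language codes arrive as plain strings. The class and method names (ReplaceLanguageCode, replaceCode) are hypothetical and only serve the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceLanguageCode {
    // Replaces 'from' with 'to' wherever 'from' sits between backslash, underscore or dot.
    static String replaceCode(String path, String from, String to) {
        // In the string literal, four backslashes give one literal backslash in the regex.
        Pattern pattern = Pattern.compile(
                "(?<=[\\\\_.])" + Pattern.quote(from) + "(?=[\\\\_.])");
        // The lookarounds are not part of the match, so the delimiters stay in place.
        return pattern.matcher(path).replaceAll(Matcher.quoteReplacement(to));
    }

    public static void main(String[] args) {
        String path = "D:\\Development\\Devlocale\\SVN\\ABC\\ABCD\\EN\\Hello_EN.rc";
        // prints D:\Development\Devlocale\SVN\ABC\ABCD\DE\Hello_DE.rc
        System.out.println(replaceCode(path, "EN", "DE"));
    }
}
```

Pattern.quote and Matcher.quoteReplacement are there because the codes come from method calls; they keep any special characters in those values from being interpreted as regex or replacement syntax.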
 
Raj S Kumar
Ranch Hand
Posts: 48
Hi Rob,
Thanks for the reply and this will surely help me a lot. I would like to understand the regex better to tweak it for my requirement.

Could you please show me a link or a thread where I could understand the expression better?

 
Rob Spoor
Sheriff
Posts: 20532
You can start with the Javadoc of java.util.regex.Pattern. Lesson: Regular Expressions should also be good.
 
Peter Taucher
Ranch Hand
Posts: 174
Maybe here:
http://www.regular-expressions.info/lookaround.html

I've never used that (lookahead/lookbehind). Rob, you're great!
 
Rob Spoor
Sheriff
Posts: 20532
I know, I know
 