Pattern matching problem

Rahul Ba
Ranch Hand

Joined: Oct 01, 2008
Posts: 205
I am trying to retrieve the contents of each body tag and, if present, the file tag that follows it.

String str = "<body>1</body><file>myFile1</file><body>2</body><body>3</body><file>myFile2</file>";
Pattern pattern = Pattern.compile("<body>(.*?)</body><file>(.*?)</file>");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println("BValue:" + matcher.group(1));
    System.out.println("FValue:" + matcher.group(2));
}

This is the output I am getting:
BValue:1
FValue:myFile1
BValue:2</body><body>3
FValue:myFile2

See, my second BValue is not coming out properly. I want the output shown below. I know the input does not follow a regular pattern, but can we still achieve this output?


BValue:1
FValue:myFile1
BValue:2
BValue:3
FValue:myFile2

Sebastian Janisch
Ranch Hand

Joined: Feb 23, 2009
Posts: 1183
I am not a fan of engaging the heavy regex engine when string positions would do just fine, as they do in your case.

You could rewrite your program using str.indexOf("<body>") etc. and loop over all occurrences.
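
A minimal sketch of that indexOf approach, assuming the tags are never nested and every opening tag has a matching closing tag. The class name IndexOfScan and the method extract are made up for illustration; they are not from the original post.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfScan {
    // Walks the string left to right, taking whichever tag (<body> or <file>) comes next.
    static List<String> extract(String input) {
        List<String> lines = new ArrayList<>();
        int pos = 0;
        while (true) {
            int bodyStart = input.indexOf("<body>", pos);
            int fileStart = input.indexOf("<file>", pos);
            if (bodyStart < 0 && fileStart < 0) {
                break; // no more tags of either kind
            }
            // Pick the tag that occurs first from the current position.
            boolean isBody = bodyStart >= 0 && (fileStart < 0 || bodyStart < fileStart);
            String open = isBody ? "<body>" : "<file>";
            String close = isBody ? "</body>" : "</file>";
            int start = (isBody ? bodyStart : fileStart) + open.length();
            int end = input.indexOf(close, start);
            if (end < 0) {
                break; // unterminated tag: stop rather than guess (assumption)
            }
            lines.add((isBody ? "BValue:" : "FValue:") + input.substring(start, end));
            pos = end + close.length();
        }
        return lines;
    }

    public static void main(String[] args) {
        String str = "<body>1</body><file>myFile1</file><body>2</body><body>3</body><file>myFile2</file>";
        extract(str).forEach(System.out::println);
    }
}
```

For the string in the question this prints the five lines asked for, with BValue:2 and BValue:3 on separate lines.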


Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19656

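The problem is that your regular expression requires a <file> to come directly after every <body>. <body>2</body> is not followed by a <file>, so the non-greedy .*? keeps expanding until it reaches a </body> that is followed by <file>, and group 1 ends up as "2</body><body>3". To prevent this, make the <file> part optional.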
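
One way to make the <file> part optional, as a sketch: wrap it in a non-capturing group followed by ?. The class name OptionalFileMatch and the method extract are only for illustration. When the optional part does not take part in a match, matcher.group(2) returns null, so it needs a null check before printing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalFileMatch {
    // (?: ... )? makes the whole <file>...</file> part optional without adding a group number.
    private static final Pattern PATTERN =
            Pattern.compile("<body>(.*?)</body>(?:<file>(.*?)</file>)?");

    static List<String> extract(String input) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            lines.add("BValue:" + matcher.group(1));
            // group(2) is null when no <file> directly follows this <body>.
            if (matcher.group(2) != null) {
                lines.add("FValue:" + matcher.group(2));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String str = "<body>1</body><file>myFile1</file><body>2</body><body>3</body><file>myFile2</file>";
        extract(str).forEach(System.out::println);
    }
}
```

With the string from the question this prints BValue:1, FValue:myFile1, BValue:2, BValue:3, FValue:myFile2, one per line.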

