Text Parsing with Regex

Andy Bowes
Ranch Hand

Joined: Jan 14, 2003
Posts: 171
Hi

I am trying to find a generic method to extract information from strings like:

"id:36 sub:001 dlvrd:001 submit date:0704270919 done date:0704270919 stat:DELIVRD err:000"

I would like to be able to extract the values of particular items by defining a Regex style pattern. I am a bit of a regex newbie and I am not sure how to go about this. I had thought about using StringTokenizer but some of the value names have spaces in them and unfortunately I have no control over the format of these strings.

Can you use regular expressions to 'name' specific variable sections of a pattern so that they can be extracted by a Matcher?

Any help will be very welcome, thanks
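
On Java 7 and later, the naming question has a direct answer: java.util.regex supports named capturing groups written as (?<name>...), and Matcher.group(String) reads them back. Below is a minimal sketch under stated assumptions. The class name NamedGroupDemo, the method extract, and the group names are made up for illustration, and the input is a shortened version of the sample string. Group names may only contain ASCII letters and digits, so "submit date" is captured under the invented name submitDate.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; the group names are illustrative choices.
final class NamedGroupDemo {
    // Named groups use the (?<name>...) syntax (Java 7+).
    // The literal spaces in "submit date" and "done date" are part of the pattern.
    private static final Pattern RECEIPT = Pattern.compile(
        "id:(?<id>\\S+) sub:(?<sub>\\d+) dlvrd:(?<dlvrd>\\d+)"
        + " submit date:(?<submitDate>\\d+) done date:(?<doneDate>\\d+)");

    // Returns the text captured by the named group, or null if the input does not match.
    static String extract(String input, String groupName) {
        Matcher m = RECEIPT.matcher(input);
        return m.find() ? m.group(groupName) : null;
    }

    public static void main(String[] args) {
        String line = "id:36 sub:001 dlvrd:001 submit date:0704270919 done date:0704270919";
        System.out.println(extract(line, "submitDate"));
    }
}
```

On older Java versions the same pattern works with plain numbered groups, read back with Matcher.group(int).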


Andy Bowes
SCJP, SCWCD
I like deadlines, I love the whooshing noise they make as they go flying past - Douglas Adams
Omer Haderi
Ranch Hand

Joined: Sep 27, 2006
Posts: 42
Hi,

java.util.Scanner has many more capabilities than StringTokenizer. Have a look at that class: you can also set its delimiter with a regex, and it has methods that return specific types.
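
To illustrate the Scanner suggestion, here is a minimal sketch, assuming a simplified input in which no key contains a space. The class name ScannerDemo and the method parse are made up for the example. It sets a regex delimiter with useDelimiter and reads typed values with nextInt. A key such as "submit date" would be split into two tokens by this delimiter, so this approach on its own does not cover the full sample string.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;

// Hypothetical example class showing Scanner with a regex delimiter.
final class ScannerDemo {
    // Reads alternating key and integer tokens, e.g. "id:36 sub:001".
    static Map<String, Integer> parse(String input) {
        Map<String, Integer> fields = new LinkedHashMap<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useLocale(Locale.ROOT);   // keep number parsing locale-independent
            scanner.useDelimiter("[:\\s]+");  // split on runs of colons and whitespace
            while (scanner.hasNext()) {
                String key = scanner.next();
                if (!scanner.hasNextInt()) {
                    break;                    // stop when a key has no integer value
                }
                fields.put(key, scanner.nextInt());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("id:36 sub:001 dlvrd:001"));
    }
}
```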

Cheers.
Andy Bowes
Ranch Hand

Joined: Jan 14, 2003
Posts: 171
Hi Omer

Thanks for the reply.
I have worked it out using Regex and it works like a dream.

Andy
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
How did you handle the names with spaces in them? I have to guess the values cannot have spaces, or you'd never know where a value ends and a name begins.


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Andy Bowes
Ranch Hand

Joined: Jan 14, 2003
Posts: 171
A Regular Expression can include spaces in the Pattern e.g.
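
The snippet posted at this point is not preserved. As a stand-in, here is a minimal sketch of what a single pattern covering all the known keys could look like. It is not the original poster's code. The class name ReceiptPattern and the method parse are invented, and the separator after "stat" is matched loosely as either a colon or a space, which is an assumption about the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: one pattern, one capturing group per known key.
final class ReceiptPattern {
    // The spaces in "submit date" and "done date" are ordinary literal characters.
    private static final Pattern RECEIPT = Pattern.compile(
        "id:(\\S+) sub:(\\d+) dlvrd:(\\d+) submit date:(\\d+) done date:(\\d+)"
        + " stat[: ](\\S+) err:(\\d+)");

    // Returns the seven captured values in pattern order, or null if nothing matches.
    static String[] parse(String input) {
        Matcher m = RECEIPT.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] values = new String[m.groupCount()];
        for (int i = 0; i < values.length; i++) {
            values[i] = m.group(i + 1); // capturing groups are numbered from 1
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "id:36 sub:001 dlvrd:001 submit date:0704270919"
            + " done date:0704270919 stat:DELIVRD err:000";
        System.out.println(String.join(", ", parse(line)));
    }
}
```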


[ May 02, 2007: Message edited by: Andy Bowes ]
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Oh, cool, you made a pattern with all known keys and got them all with one match. I was thinking of something generic to get one key:value at a time, which would survive the insertion of a new key:value one day but wouldn't be nearly as self-documenting.
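
The generic one-pair-at-a-time idea could be sketched as below. This is a rough sketch, assuming that keys consist of letters and spaces, that values never contain whitespace, and that every key is followed by a colon. The class name KeyValueParser and the method parse are made up for the example, and the input is a shortened version of the sample string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: extracts key:value pairs without knowing the keys in advance.
final class KeyValueParser {
    // Key: a letter followed by letters or spaces; value: a run of non-whitespace.
    private static final Pattern PAIR = Pattern.compile("([A-Za-z][A-Za-z ]*):(\\S+)");

    // Returns the pairs in the order they appear in the input.
    static Map<String, String> parse(String input) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {                       // each find() yields the next pair
            fields.put(m.group(1).trim(), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "id:36 sub:001 dlvrd:001 submit date:0704270919"
            + " done date:0704270919 err:000";
        System.out.println(parse(line).get("submit date"));
    }
}
```

A new key:value pair added to the string later would show up in the map without any change to the pattern, at the cost of the pattern no longer documenting which keys are expected.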
 