Need help to parse text files.

Andrew Roberts
Greenhorn

Joined: Nov 11, 2011
Posts: 2
I am learning Java and need some assistance.

I'm trying to make a program that can scrape some data out of a text file and get certain important parts. The text file contains the source code for an eBay auction. What I am trying to do is grab key parts like the price, item location, etc.

I can get the source code and save it to a file and I can read and write files. What I need to figure out is how to grab that particular part of the file.

I know that I can use regular expressions to match things that I'm looking for, but how can I grab a part of the file that is after the match to the regex? I was able to match a line in the file that has the particular piece of info I want and store that in a separate string, but the line is quite long and I'm not sure how to break it down to get at what I need, or handle a situation where the rest of the information is on the next line.

Ideally I would like to get whatever is left on the line after the regex match, and then run a StringTokenizer with a space delimiter to capture the data, ending when it reaches something like a < or a ".

I suppose I could take a whole line and tokenize it but there has to be a better way. Any assistance is appreciated.
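One way to get "whatever comes after the match, up to a < or a "" is a capture group, which avoids the tokenizer step entirely. The following is only a minimal sketch: the class name AfterMatch, the method valueAfter, and the sample markup are made up for illustration and are not taken from a real eBay page. It assumes the whole file has been read into one String, so a value that continues on the next line is still found.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterMatch {
    // Returns the text that follows `label`, up to (not including) the next '<' or '"'.
    static Optional<String> valueAfter(String source, String label) {
        // Pattern.quote treats the label as literal text. The capture group
        // collects every character that is not '<' or '"'; a negated character
        // class also matches newlines, so the value may span lines.
        Pattern p = Pattern.compile(Pattern.quote(label) + "\\s*([^<\"]+)");
        Matcher m = p.matcher(source);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample markup; the second value starts on a new line.
        String page = "<td>Item location: Austin, Texas</td>\n<td>Price:\n US $12.50</td>";
        System.out.println(valueAfter(page, "Item location:").orElse("not found"));
        System.out.println(valueAfter(page, "Price:").orElse("not found"));
    }
}
```

If a label is immediately followed by a tag rather than by text, the pattern would need to be adjusted to skip that tag first.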
Greg Charles
Sheriff

Joined: Oct 01, 2001
Posts: 2841

Hi Andrew, and welcome to JavaRanch!

If you're parsing XML, then Java has some nice frameworks that help you parse it. If you're working with a proprietary text format, then you're pretty much stuck with rolling your own solution. I'm not really clear on your specific questions though. Maybe you could give an example of what you're trying to do, and what's not working?
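For the XML case, a rough sketch using the DOM parser that ships with the JDK (javax.xml.parsers) might look like this. The class name XmlLookup, the method firstText, and the sample document are hypothetical. It only works on well-formed XML, which real-world web page source usually is not.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlLookup {
    // Returns the text content of the first element named `tag`, or null if there is none.
    static String firstText(String xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        } catch (Exception e) {
            // Parser setup, I/O, and malformed-input errors all end up here.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        String xml = "<item><price>12.50</price><location>Austin, Texas</location></item>";
        System.out.println(firstText(xml, "price"));
        System.out.println(firstText(xml, "location"));
    }
}
```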
Andrew Roberts
Greenhorn

Joined: Nov 11, 2011
Posts: 2
Thanks. It's proprietary: just the source code for a webpage. However, it is unique (mostly). I've got it all done now. I just have to have a method for each item I want to find, match a section with a regex, then use a Scanner and a StringTokenizer to break things down, plus some if/then stuff to allow for small variances in the data. It was a pain, but it's working now. I was just hoping there was an easier way to do it, say, grab the next x number of characters AFTER a regex match and then just work with that.
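For anyone who finds this thread later: grabbing the next x characters after a match can be done with Matcher.end(), which gives the index just past the match, followed by a substring call. This is only a sketch, and the names NextChars and charsAfter are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NextChars {
    // Returns up to `count` characters that follow the first match of `regex`,
    // or null if the regex does not match anywhere in `source`.
    static String charsAfter(String source, String regex, int count) {
        Matcher m = Pattern.compile(regex).matcher(source);
        if (!m.find()) {
            return null;
        }
        int start = m.end();                                 // index just past the match
        int end = Math.min(source.length(), start + count);  // do not run past the end
        return source.substring(start, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample text.
        String line = "Price: US $12.50 each";
        System.out.println(charsAfter(line, "Price:\\s*", 9));
    }
}
```

The Math.min guard matters when the match is near the end of the text; without it, substring would throw a StringIndexOutOfBoundsException.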
 
 