Reading a complex file

amit bose
Greenhorn

Joined: Apr 01, 2005
Posts: 25
Hi,

I have a text file (*.txt) in the following format:

<tag1>content1</tag1>
<tag2>content2</tag2>
...Etc till...
<tagN>contentN</tagN>
{1 : Data1}{2 : Data2..
...Etc till....
}

What would be an optimal way of reading it?
Would a plain buffered stream read suffice, or does a better alternative exist?

Cheers,
Amit
[ December 16, 2006: Message edited by: amit bose ]
Rahul Bhattacharjee
Ranch Hand

Joined: Nov 29, 2005
Posts: 2308
Hi Amit,

Does the file end with
{1 : Data1}{2 : Data2..
...Etc till....
}

Or is this what you want?

Tag 1 =-> value : content1, like this.
Your file format looks a lot like XML, so why not use an XML parser?


Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39548
I'd probably create a lexer (using JFlex), because these little custom file formats have a tendency to become more complex over time, which makes a hand-coded parser harder and more error-prone to maintain.


amit bose
Greenhorn

Joined: Apr 01, 2005
Posts: 25
My input file is not an XML file; it only has some header information present as XML tags.
What I need to extract from this file is
content1, content2, ..., contentN (and)
Data1, Data2, ..., DataN

So should I use regex for this? I am not sure about regex performance, given that my file size would not exceed say 100 lines. However, the bulk of the input files would be quite enormous.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Well, Ulf's advice still seems pretty good. But regexes would work too. I think it's too early to worry about imaginary performance problems here - try it and see. Chances are good that the time it takes to read the file will be greater than the time necessary to parse it.
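
As a rough sketch of the regex route, something like the following might do. It assumes each header sits on one line as <tagN>contentN</tagN> and that each data entry is a self-contained {n : value} group with no nested braces; the actual layout of the data section is not fully clear from the sample, so the patterns may need adjusting. The class and method names (ComplexFileParser, tagContents, dataValues) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ComplexFileParser {
    // Header such as <tag1>content1</tag1>; the \1 backreference requires
    // the closing tag name to match the opening one.
    private static final Pattern TAG = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    // Data entry such as {1 : Data1}. Assumption: the value contains no braces.
    private static final Pattern ENTRY =
            Pattern.compile("\\{\\s*(\\d+)\\s*:\\s*([^{}]*?)\\s*\\}");

    /** Collects content1, content2, ... from the tag-style header lines. */
    public static List<String> tagContents(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    /** Collects Data1, Data2, ... from the {n : value} entries. */
    public static List<String> dataValues(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "<tag1>content1</tag1>\n"
                + "<tag2>content2</tag2>\n"
                + "{1 : Data1}{2 : Data2}";
        System.out.println(tagContents(sample)); // prints [content1, content2]
        System.out.println(dataValues(sample));  // prints [Data1, Data2]
    }
}
```

For a real file, the text could be loaded first with new String(Files.readAllBytes(path), StandardCharsets.UTF_8) and passed to both methods; at around 100 lines per file, holding the whole file in memory should not be a concern.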

[amit]: ...given that my file size would not exceed say 100 lines. However, the bulk of the input files would be quite enormous.

That didn't really make sense to me. Are you saying that 100 lines is enormous? Are some of the lines extremely long? Are there many, many files? Or something else?


"I'm not back." - Bill Harding, Twister
amit bose
Greenhorn

Joined: Apr 01, 2005
Posts: 25
Hi Jim,

By that line I meant that the number of such files would be large.
About a thousand can safely be assumed for now. The volume is sure to go up in future.

Cheers,
Amit
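
With on the order of a thousand small files, the "try it and see" suggestion earlier in the thread can be checked directly. The sketch below is only one way to do it: it assumes the files are *.txt files sitting in a single directory and uses a simple tag regex as the parsing step. BatchTiming and measure are hypothetical names, and System.nanoTime gives only a coarse picture, not a proper benchmark.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BatchTiming {
    // Same single-line header assumption as before: <tagN>contentN</tagN>.
    private static final Pattern TAG = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    /** Returns {filesRead, tagsFound, readNanos, parseNanos} for the *.txt files in dir. */
    public static long[] measure(Path dir) throws IOException {
        long files = 0, tags = 0, readNanos = 0, parseNanos = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path file : stream) {
                long t0 = System.nanoTime();
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                long t1 = System.nanoTime();
                Matcher m = TAG.matcher(text);
                while (m.find()) {
                    tags++;
                }
                long t2 = System.nanoTime();
                files++;
                readNanos += t1 - t0;   // time spent on file I/O
                parseNanos += t2 - t1;  // time spent in the regex
            }
        }
        return new long[] {files, tags, readNanos, parseNanos};
    }

    public static void main(String[] args) throws IOException {
        // Create a few throwaway sample files so the sketch runs on its own.
        Path dir = Files.createTempDirectory("complex-files");
        for (int i = 0; i < 3; i++) {
            Files.write(dir.resolve("sample" + i + ".txt"),
                    "<tag1>content1</tag1>\n{1 : Data1}".getBytes(StandardCharsets.UTF_8));
        }
        long[] r = measure(dir);
        System.out.printf("files=%d tags=%d read=%dus parse=%dus%n",
                r[0], r[1], r[2] / 1000, r[3] / 1000);
    }
}
```

Comparing the read and parse totals on a realistic batch shows whether parsing is worth optimizing at all before choosing between regexes and a generated lexer.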