
XML parsing VS simple TXT parsing using java streams

john wesley
Ranch Hand

Joined: Feb 14, 2005
Posts: 47
I have a situation here. I am currently storing huge amounts of data (half a GB, one GB, at most 2 GB) in a text file (CSV style), and then I parse it using simple Java streams. I read the file once, calculate some summaries, and fill in an Oracle table. However, I am now thinking of storing the data in XML format instead of text and using SAX for parsing. The file size would surely shoot up, maybe double, but the more important question is parsing performance. Is XML suited to this amount of data? Will SAX parsing be any better than simply reading the text file using Java streams and tokenizing it?

Can someone please shed some light on this issue?
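For reference, the "read once and summarize" approach described above might look roughly like the sketch below. It is only an illustration under assumptions: the class name `CsvColumnSum`, the two-column sample layout, and the choice of summing one numeric column are all hypothetical, and the plain `split` assumes there are no quoted fields or embedded commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: streams CSV text and sums one numeric column.
final class CsvColumnSum {

    /** Reads the source line by line, so memory use does not grow with file size. */
    static double sumColumn(Reader source, int columnIndex) {
        double total = 0.0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Assumption: no quoted fields, so a plain split is enough.
                String[] fields = line.split(",", -1);
                total += Double.parseDouble(fields[columnIndex].trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed sample layout: id,amount
        String sample = "a,10.5\nb,20.0\nc,1.5\n";
        System.out.println(sumColumn(new StringReader(sample), 1));
    }
}
```

For a real file, the `StringReader` would be replaced by a `FileReader` or an `InputStreamReader` over a `FileInputStream`; the loop itself stays the same.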


"Let the one among you who has never sinned throw the first stone.." -A Hero
Arun Prasath
Ranch Hand

Joined: Sep 17, 2003
Posts: 192
Yes, you can write huge amounts of XML, not by using SAX or DOM, but by using the SAX extensions that are available.
I would suggest reading this article.
It is a good one that covers this topic.
Hope it helps.

SCJP 1.4, SCDJWS, SCJA
I can do ALL things through CHRIST who strengthens me.
William Brogden
Author and all-around good cowpoke

Joined: Mar 22, 2000
Posts: 13036
Since XML parsing will add LOTS of overhead, I can't imagine how you could avoid a major slowdown. Any XML processing will involve creating lots of objects, conversion to and from String, etc.
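To make the overhead concrete, here is a rough sketch of what a SAX-based summary could look like. The element names (`rows`, `row`, `amount`) and the class name `SaxAmountSum` are assumptions for illustration, not anything from the original data. Note the per-element callbacks and the `String` built for every value before it can be parsed as a number.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: sums the text of every <amount> element using SAX.
final class SaxAmountSum {

    static final class SumHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inAmount;
        double total;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("amount".equals(qName)) {
                inAmount = true;
                text.setLength(0); // start collecting a fresh value
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text node in several chunks.
            if (inAmount) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("amount".equals(qName)) {
                // A String is created here for every single value.
                total += Double.parseDouble(text.toString().trim());
                inAmount = false;
            }
        }
    }

    static double sumAmounts(InputStream xml) {
        try {
            SumHandler handler = new SumHandler();
            SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
            return handler.total;
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<rows><row><id>a</id><amount>10.5</amount></row>"
                + "<row><id>b</id><amount>20.0</amount></row></rows>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(sumAmounts(in));
    }
}
```

Memory use stays flat because SAX streams the document, but every row still costs several callbacks plus the markup bytes themselves, which is the overhead being described.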
IF (big if) your data is all ASCII, you will be much faster handling the input as byte streams and byte[] buffers rather than character streams, and staying well away from String conversion until the last minute.
XML shines when the data structure is complex; anything that can be represented as CSV is not a good candidate.
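The byte-level approach could be sketched as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the class name `ByteCsvSum` is hypothetical, the input is assumed to be pure ASCII, and the summed column is assumed to hold non-negative integers with no quoting.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical example: sums an integer CSV column straight from bytes.
final class ByteCsvSum {

    /** Sums one column without creating a String per line or per field. */
    static long sumColumn(InputStream in, int columnIndex) {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        long value = 0; // digits of the current field accumulated so far
        int column = 0; // index of the field currently being read
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    byte b = buffer[i];
                    if (b == ',') {
                        if (column == columnIndex) {
                            total += value;
                        }
                        value = 0;
                        column++;
                    } else if (b == '\n') {
                        if (column == columnIndex) {
                            total += value;
                        }
                        value = 0;
                        column = 0;
                    } else if (column == columnIndex && b >= '0' && b <= '9') {
                        value = value * 10 + (b - '0');
                    }
                    // Any other byte (including '\r') is ignored.
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The last line may not end with a newline.
        if (column == columnIndex) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed sample layout: id,count,flag
        byte[] sample = "a,10,x\nb,20,y\nc,12,z\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sumColumn(new ByteArrayInputStream(sample), 1));
    }
}
```

The parser state (`value`, `column`) lives outside the read loop, so a field that straddles two buffer fills is still handled correctly. For a real file, a `FileInputStream` would take the place of the `ByteArrayInputStream`.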