Reading Very Big XML Files

 
pradeep selvaraj
Ranch Hand
Posts: 62
Hi, we are currently using JDOM to read XML files. The files range from 10 MB to 40 MB. Most of the smaller ones (the 10 MB files) are read and populated in our application in about 15 seconds. As the file size gets bigger, loading takes an unacceptable amount of time for the users, and sometimes the application even hangs. Are there any other APIs that are good for reading big XML files quickly?

Thanks in advance

Pradeep
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13048
The key point you have not mentioned is what you do with the file once it has been read in. If you have to manipulate the resulting DOM, then why not use the DOM parser that has been built into Java since version 1.4?
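
For anyone who has not used the built-in parser, here is a minimal sketch of reading a document with the standard javax.xml.parsers API. The class name BuiltInDomSketch and its two helper methods are hypothetical, made up for illustration. Keep in mind that this still builds the whole tree in memory, so it may not be any faster than JDOM on its own.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class; the name and methods are illustrative only.
final class BuiltInDomSketch {

    // Parses the whole stream into an in-memory W3C DOM tree.
    static Document parse(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(in);
        // Merge adjacent text nodes so later manipulation is simpler.
        doc.getDocumentElement().normalize();
        return doc;
    }

    // Counts every element with the given tag name anywhere in the tree.
    static int countElements(Document doc, String tagName) {
        return doc.getElementsByTagName(tagName).getLength();
    }
}
```

For a file on disk you would pass something like new FileInputStream("big.xml") to parse(), where the file name is just a placeholder.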

Bill
 
pradeep selvaraj
Ranch Hand
Posts: 62
Yes, we do manipulate the data, but we feel that it's a bit slower than we would like. Are there any other tools available that you think might be useful for reading and manipulating very big XML files?

Thanks
Pradeep
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13048
In the early days of XML there was a lot of interest in "lazy" parsers which would make a pass through a document locating the main elements but leave lower parts of the hierarchy in plain text form unless needed. I just did a Google search for "lazy xml parser" but most of the references were old.
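
To make the "lazy" idea concrete, here is a rough sketch using only the standard StAX (javax.xml.stream) and DOM APIs. It is not the API of any particular lazy parser; the class LazySection and its methods are hypothetical. It makes one pass over the document, keeps each direct child of the root as plain text, and only builds a DOM for a child when someone asks for it. It assumes the children do not rely on namespace declarations made on the root element.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class: one direct child of the root, held as text until needed.
final class LazySection {
    final String name;
    private final String rawXml;
    private Element parsed; // stays null until element() is first called

    LazySection(String name, String rawXml) {
        this.name = name;
        this.rawXml = rawXml;
    }

    boolean isParsed() {
        return parsed != null;
    }

    // Builds the DOM for this section on first access and caches it.
    Element element() throws Exception {
        if (parsed == null) {
            parsed = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rawXml)))
                    .getDocumentElement();
        }
        return parsed;
    }

    // Single pass: copies each depth-2 subtree into its own text buffer.
    static List<LazySection> scan(Reader in) throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        List<LazySection> sections = new ArrayList<>();
        int depth = 0;
        String name = null;
        StringWriter buffer = null;
        XMLEventWriter writer = null;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
                if (depth == 2) { // a direct child of the root starts here
                    name = event.asStartElement().getName().getLocalPart();
                    buffer = new StringWriter();
                    writer = outFactory.createXMLEventWriter(buffer);
                }
            }
            if (writer != null) {
                writer.add(event); // copy the event into the current buffer
            }
            if (event.isEndElement()) {
                if (depth == 2) { // the child is complete, store its raw text
                    writer.flush();
                    writer.close();
                    sections.add(new LazySection(name, buffer.toString()));
                    writer = null;
                }
                depth--;
            }
        }
        reader.close();
        return sections;
    }
}
```

Note that this sketch re-serializes each subtree, so it only pays off if most sections are never opened. A real lazy parser would more likely remember offsets into the original text instead.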

Lately there has been interest in something called "Fast Infoset", a lossless binary representation of an XML document in a more easily parsed form. See, for example, this page in the GlassFish project. Whether this is applicable to your problem, I don't know.

Bill
 