
which parser to use under these conditions

 
Gul Khan
Ranch Hand
Posts: 173
Hi All,
I need your expert opinion on choosing an XML parser. Which one performs best, i.e. parses fastest, for documents ranging from 100 KB to 5 MB?

From what I have read, the Oracle parser seems to be the best in that respect.

My second question: if I use JDOM to keep my parser options open, does anyone know roughly how much performance I would lose compared with using plain SAX?

Thanks in advance.

GUL
 
Lasse Koskela
author
Sheriff
Posts: 11962
I would recommend Apache Xerces simply because it's the de facto standard XML parser implementation (well, perhaps after the built-in version that comes with J2SE 1.4), and you'll have an easier time finding support compared to using, say, Oracle's XML parser.

In terms of performance, SAX will always have the theoretical advantage over DOM-style parsing because the DOM parser will parse the whole document into memory while a SAX parser only hands you events as it goes through the file.
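
To make the difference concrete, here is a minimal sketch that uses only the JAXP classes bundled with the JDK. The class name `SaxVsDomSketch` and the sample document are made up for illustration. Both methods count elements, but only the DOM version holds the whole tree in memory while doing so.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not taken from any library.
class SaxVsDomSketch {

    // SAX: the parser calls back once per start tag and retains nothing itself.
    static int countWithSax(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    // DOM: the whole document is built as a tree before we can query it.
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // "*" matches every element, including the root.
            return doc.getElementsByTagName("*").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        System.out.println("SAX count: " + countWithSax(xml));
        System.out.println("DOM count: " + countWithDom(xml));
    }
}
```

For a 5 MB document the SAX handler's memory use stays roughly constant, while the DOM tree grows with the document, which is where the theoretical advantage comes from.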
 
Gul Khan
Ranch Hand
Posts: 173
Thanks Lasse,
We have a requirement to use JAXP or JDOM as a wrapper around the actual parser. I know Oracle has an implementation of the JAXP interfaces, but does Xerces also implement the same interfaces?

JDOM can use any parser; I'm not sure how many parsers JAXP can work with.
 
Kristof Camelbeke
Ranch Hand
Posts: 97
The Xerces documentation says that it supports JAXP 1.2.

see xerces
 
clio katz
Ranch Hand
Posts: 101
Xerces is JAXP compliant, as are most of the best (most widely used) parsers. JAXP compliance just assures you a 'known' parser interface; what happens next is a matter of the specific parser implementation. For example, you can't assume which features a given parser implements, since features vary widely across parsers.

To un-cart the proverbial horse, as I understand it:

. JDOM (since 2003) integrates Jaxen for XPath support. Jaxen requires JAXP.

. JAXP has been bundled into Sun's Java class library since 1.4.

...so you'll get JAXP under either scenario, and the question becomes which parser implementation best satisfies your specific requirements.

I, like Lasse, prefer Xerces for the reasons he stated. However, it seems to be more of a religious question among practitioners these days. You can pretty easily run some performance tests using representative documents. JAXP makes it simple to run comparison tests with various parsers, because the parser implementation can be swapped without changing the calling code.
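
A rough sketch of such a comparison follows. The class `ParserTimingSketch` and its method names are hypothetical, and the timing is deliberately crude (one warm-up parse, then a simple average), so treat the numbers as a first impression only. It assumes Java 6 or later for `SAXParserFactory.newInstance(String, ClassLoader)`; on older JVMs the `javax.xml.parsers.SAXParserFactory` system property selects the implementation instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing harness; names are illustrative only.
class ParserTimingSketch {

    // A null class name means "whatever JAXP selects by default".
    static SAXParserFactory factoryFor(String className) {
        return className == null
                ? SAXParserFactory.newInstance()
                : SAXParserFactory.newInstance(className, null);
    }

    // Parses the same bytes `runs` times; returns mean wall-clock nanoseconds per parse.
    static long averageParseNanos(SAXParserFactory factory, byte[] xml, int runs) {
        try {
            SAXParser parser = factory.newSAXParser();
            DefaultHandler noOp = new DefaultHandler(); // ignores every event
            parser.parse(new ByteArrayInputStream(xml), noOp); // warm-up, not timed
            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                parser.parse(new ByteArrayInputStream(xml), noOp);
            }
            return (System.nanoTime() - start) / runs;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // args[0]: path to a representative document; args[1..]: factory class names.
    public static void main(String[] args) throws IOException {
        byte[] xml = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("default: "
                + averageParseNanos(factoryFor(null), xml, 20) + " ns");
        for (int i = 1; i < args.length; i++) {
            System.out.println(args[i] + ": "
                    + averageParseNanos(factoryFor(args[i]), xml, 20) + " ns");
        }
    }
}
```

With the Xerces jar on the classpath, something like `java ParserTimingSketch sample.xml org.apache.xerces.jaxp.SAXParserFactoryImpl` would time the default parser against Xerces (`sample.xml` stands in for one of your own documents). Any other JAXP-compliant parser can be added the same way by passing its factory class name.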



hth
 
Gul Khan
Ranch Hand
Posts: 173
Thanks a lot for all the info. That was a great help.

GUL
 