
which parser to use under these conditions

Gul Khan
Ranch Hand

Joined: Sep 03, 2003
Posts: 173
Hi All,
I need your expert opinion on choosing an XML parser. Which one performs best, i.e. parses fastest, for documents ranging from 100 KB to 5 MB?

From what I have read, the Oracle parser seems to be the best in that respect.

My second question: if I use JDOM to keep my parser options open, does anyone know roughly how much performance I would lose compared with using plain SAX?

Thanks in advance.

Lasse Koskela

Joined: Jan 23, 2002
Posts: 11962
I would recommend Apache Xerces simply because it's the de facto standard XML parser implementation (well, perhaps after the built-in version that comes with J2SE 1.4) and you'll have an easier time finding support than you would with, say, Oracle's XML parser.

In terms of performance, SAX will always have the theoretical advantage over DOM-style parsing because a DOM parser builds the whole document tree in memory, while a SAX parser only hands you events as it goes through the file.
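To make the difference concrete, here is a minimal sketch (not from the original post) that counts elements both ways using only the JAXP classes bundled with the JDK. The class name `SaxVsDom` and the sample XML are assumptions made up for illustration. The SAX version keeps a single counter while events stream past; the DOM version has to build the full tree before it can answer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of any library.
public class SaxVsDom {

    /** SAX: the handler receives one event per element and keeps only a counter. */
    static int countWithSax(byte[] xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml), handler);
        return count[0];
    }

    /** DOM: the whole document is materialized as a tree before we can query it. */
    static int countWithDom(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        // "*" matches every element in the document, including the root.
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("SAX count: " + countWithSax(xml));
        System.out.println("DOM count: " + countWithDom(xml));
    }
}
```

Both methods return the same count; the difference is in how much of the document each one has to hold in memory at once, which is what matters for the multi-megabyte files mentioned in the question.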

Author of Test Driven (2007) and Effective Unit Testing (2013)
Gul Khan
Ranch Hand

Joined: Sep 03, 2003
Posts: 173
Thanks Lasse,
We have a requirement to use JAXP or JDOM as a wrapper around the actual parser. I know Oracle has an implementation of the JAXP interfaces, but does Xerces also implement the same interfaces?

JDOM can use any parser; I'm not sure how many parsers JAXP can work with.
Kristof Camelbeke
Ranch Hand

Joined: Nov 28, 2001
Posts: 97
The Xerces documentation says that it supports JAXP 1.2.

see xerces
clio katz
Ranch Hand

Joined: Apr 30, 2004
Posts: 101
xerces is jaxp compliant, as are most of the best (most widely used) parsers. jaxp compliance just assures you a 'known' parser interface ... what happens next is a matter of the specific parser implementation. for example, you can't guess what parser features a given parser implements - features vary widely across parsers.
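One way to see how features vary is to ask a factory directly. The sketch below is an illustration added here, not something from the original post: the class name `FeatureProbe` is hypothetical, and the last URI in `main` is deliberately made up so that it will not be recognized. It uses the standard `SAXParserFactory.getFeature` call and reports which feature URIs the parser on the classpath recognizes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical illustration class; not part of any library.
public class FeatureProbe {

    /** Returns, per feature URI, "true", "false", "not recognized" or "not supported". */
    static Map<String, String> probe(SAXParserFactory factory, String... featureUris) {
        Map<String, String> report = new LinkedHashMap<>();
        for (String uri : featureUris) {
            try {
                report.put(uri, String.valueOf(factory.getFeature(uri)));
            } catch (SAXNotRecognizedException e) {
                // The parser has never heard of this feature.
                report.put(uri, "not recognized");
            } catch (SAXNotSupportedException | ParserConfigurationException e) {
                // The parser knows the feature but cannot report or honor it here.
                report.put(uri, "not supported");
            }
        }
        return report;
    }

    public static void main(String[] args) {
        Map<String, String> report = probe(
                SAXParserFactory.newInstance(),
                "http://xml.org/sax/features/namespaces",       // standard SAX2 feature
                "http://xml.org/sax/features/validation",       // standard SAX2 feature
                "http://example.org/features/does-not-exist");  // made-up URI for the demo
        report.forEach((uri, state) -> System.out.println(uri + " -> " + state));
    }
}
```

Running the same probe against each candidate parser gives a quick picture of what each one supports beyond the common JAXP interface.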

to un-cart the proverbial horse ... as i understand it:

. jdom (since 2003) integrates jaxen for xpath support. jaxen requires jaxp.

. jaxp has been bundled into sun's java class lib since 1.4, so you'll get jaxp under either scenario, and the question becomes which parser impl best satisfies your specific requirements.

i, like Lasse, prefer Xerces for the reasons he stated. however, it seems more of a religious question among practitioners these days... you can pretty easily do some performance tests using representative documents. jaxp makes it simple for you to run comparison tests with various parsers:
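The snippet that originally followed this post did not survive, so what follows is only a rough sketch of such a comparison, not the poster's code. It assumes each candidate parser's JAXP factory class is on the classpath and passed as a command-line argument (for Xerces that class is `org.apache.xerces.jaxp.SAXParserFactoryImpl`); with no arguments it times the JDK's default parser. The class name `ParserBenchmark`, the generated sample document, and the run counts are all assumptions chosen for illustration; for real numbers you would feed in your own representative files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of any library.
public class ParserBenchmark {

    /** Loads a specific JAXP factory by class name, or the platform default when null. */
    static SAXParserFactory factoryFor(String factoryClassName) {
        SAXParserFactory factory = (factoryClassName == null)
                ? SAXParserFactory.newInstance()
                : SAXParserFactory.newInstance(factoryClassName,
                        ParserBenchmark.class.getClassLoader());
        factory.setNamespaceAware(true);
        return factory;
    }

    /** Average wall-clock time of one parse; parser creation is kept outside the timing. */
    static long averageParseNanos(SAXParserFactory factory, byte[] xml, int runs)
            throws Exception {
        DefaultHandler noOp = new DefaultHandler();
        // Warm-up pass so JIT compilation does not dominate the measurement.
        for (int i = 0; i < runs; i++) {
            factory.newSAXParser().parse(new ByteArrayInputStream(xml), noOp);
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            SAXParser parser = factory.newSAXParser();
            long start = System.nanoTime();
            parser.parse(new ByteArrayInputStream(xml), noOp);
            total += System.nanoTime() - start;
        }
        return total / runs;
    }

    /** Builds a synthetic document; replace with your own representative files. */
    static byte[] sampleDocument(int records) {
        StringBuilder sb = new StringBuilder(
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><records>");
        for (int i = 0; i < records; i++) {
            sb.append("<record id=\"").append(i).append("\"><name>item ")
              .append(i).append("</name></record>");
        }
        return sb.append("</records>").toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = sampleDocument(20_000);
        String[] candidates = (args.length == 0) ? new String[] {null} : args;
        for (String name : candidates) {
            long nanos = averageParseNanos(factoryFor(name), xml, 20);
            System.out.println((name == null ? "JDK default" : name)
                    + ": " + (nanos / 1_000) + " us per parse");
        }
    }
}
```

Because the calling code only touches JAXP interfaces, swapping parsers is a matter of changing the factory class name, which is the point made above. Treat the timings as a rough guide; a simple loop like this is sensitive to JVM warm-up and garbage collection.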

Gul Khan
Ranch Hand

Joined: Sep 03, 2003
Posts: 173
Thanks a lot for all the info. That was a great help.
