Sounds like a question about JBoss to me. Avihai, should I move this post to the JBoss forum or would you like to expand on your question? Right now I don't really see anything that can be answered.
avihai marchiano
Ranch Hand
Joined: Jan 10, 2007
Posts: 342
posted
0
Let's ignore JBoss.
What is the best DOM parser?
Thank you.
Rahul Bhattacharjee
Ranch Hand
Joined: Nov 29, 2005
Posts: 2300
posted
0
Originally posted by avihai marchiano:
What is the best DOM parser?
This is how I would approach it if someone asked me what the best DOM parser is.
As you may know, DOM constructs an in-memory tree model of the XML, and that tree is much larger than the XML document itself. So I would give weight to the parser that consumes less memory.
The second thing to look at is how much time the parser takes to parse the XML into an in-memory tree model.
Another thing to check is how that time changes with the size of the documents. I have never done this analysis myself, but doing so might answer your question.
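A rough way to run that kind of comparison is sketched below. It is only an illustration: the class name DomParseBenchmark and the synthetic <item> documents are made up for the example, it measures whatever parser DocumentBuilderFactory happens to select, and the heap figures from Runtime are crude estimates, not a proper profile.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class DomParseBenchmark {

    // Builds a synthetic XML document with the given number of <item> elements
    // (hypothetical test data, not from any real application).
    static byte[] makeXml(int items) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ").append(i).append("</item>");
        }
        return sb.append("</root>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the bytes into a DOM tree using the parser the factory lookup selects.
    static Document parse(byte[] xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml));
    }

    // Approximate heap currently in use; good enough for a rough comparison only.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws Exception {
        for (int items : new int[] {1_000, 10_000, 100_000}) {
            byte[] xml = makeXml(items);
            System.gc(); // only a hint to the JVM, so the numbers stay approximate
            long heapBefore = usedHeap();
            long start = System.nanoTime();
            Document doc = parse(xml);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long heapDeltaKb = (usedHeap() - heapBefore) / 1024;
            int count = doc.getElementsByTagName("item").getLength();
            System.out.printf("%d bytes of XML, %d items: %d ms, about %d KB of heap%n",
                    xml.length, count, elapsedMs, heapDeltaKb);
        }
    }
}
```

To compare parsers, you would run the same loop with each implementation on the classpath and repeat every measurement several times, since a single run is dominated by JVM warm-up.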
Do you have reason to believe that the default implementation used by javax.xml.parsers.DocumentBuilder won't do? If so, in what way is it insufficient?
Note how the throughput "winner" varies according to document size.
Bill (Have you considered using SAX parsing with a custom approach to saving the data in Java classes? Building a DOM involves a LOT of object creation you may be able to avoid. Does your program require a DOM for data manipulation?) [ September 19, 2007: Message edited by: William Brogden ]
Can you please explain your last comment about considering SAX even if you need the whole document?
As far as I know, if you need the whole document you should prefer DOM.
I only need to read data.
Ulf Dittmer
Marshal
Joined: Mar 22, 2005
Posts: 35232
7
posted
0
As far as I know, if you need the whole document you should prefer DOM.
I wouldn't say prefer. If you need access to the whole document at the same time, then DOM is probably the way to go. SAX will present the whole document as well, just in a sequential manner. So if you're processing (say) the 5th <foobar> element, and the code suddenly thinks "uh oh, I need the value of the 3rd <foobar> element, and I didn't save it when it was parsed", then you're out of luck. But if upon reading the 3rd element you already know that you're going to need it later, then you can store it somewhere, and access it later.
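To make the "store it when you see it" idea concrete, here is a minimal sketch. The class name FoobarHandler and the collect helper are invented for the example, and it assumes <foobar> elements are not nested inside each other.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FoobarHandler extends DefaultHandler {
    // Values of the <foobar> elements seen so far, saved as they stream past.
    final List<String> seen = new ArrayList<>();
    // Non-null only while we are inside a <foobar> element.
    private StringBuilder text;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("foobar".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append rather than assign.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("foobar".equals(qName)) {
            seen.add(text.toString());
            text = null;
            if (seen.size() == 5) {
                // The 3rd value is still available because we stored it earlier.
                System.out.println("At the 5th <foobar>, the 3rd was: " + seen.get(2));
            }
        }
    }

    // Hypothetical convenience method: parse a string and return the saved values.
    static List<String> collect(String xml) throws Exception {
        FoobarHandler handler = new FoobarHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }
}
```

The point is that the handler, not the parser, decides what to remember; anything it does not store is gone once the event has passed.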
I only need to read data.
That's actually a strong reason to prefer SAX, because DOM does a good many things (and uses quite a bit of memory) to set things up so that you can change and save the document.
avihai marchiano
Ranch Hand
Joined: Jan 10, 2007
Posts: 342
posted
0
Thank you.
I certainly agree.
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 12267
1
posted
0
Can you please explain your last comment about considering SAX even if you need the whole document?
I will give it a try. Suppose your XML document represents a collection of books with the data for each one inside a <book> element. Each starting book tag contains some attributes you want to keep and there are additional elements with various bits of data. We are going to define a book class where each instance represents all the data inside one <book> element so the collection of instances represents the usable data from the document.
In your custom SAX event handler you do this:
1. When a startElement event for "book" occurs, create a new book object, passing the constructor the "Attributes" - keep a reference to the new object as your working object.
2. For each subsequent event, keep track of the current element and/or pass the text data you need to keep to some method in the working book object. (Remember that characters() events may contain only part of the data for a Text node.)
3. When you get the endElement event for "book" that object is complete - add the reference to some collection.
This saves all the object creation that would go into a DOM and lets you skip data you don't need for a particular application.
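The three steps above could look roughly like the following. This is a sketch under assumptions, not a finished design: the Book and BookHandler classes, the "isbn" attribute, and the choice to keep only <title> and <author> are all invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical data class: one instance per <book> element.
class Book {
    final String isbn;
    final Map<String, String> fields = new HashMap<>();

    // Copy what we need right away; the Attributes object is only valid during the callback.
    Book(Attributes atts) {
        this.isbn = atts.getValue("isbn"); // "isbn" is an assumed attribute name
    }

    void setField(String name, String value) {
        fields.put(name, value);
    }
}

class BookHandler extends DefaultHandler {
    final List<Book> books = new ArrayList<>();
    private Book current;                                   // the working object
    private final StringBuilder text = new StringBuilder(); // text of the current element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("book".equals(qName)) {
            current = new Book(atts);                        // step 1
        }
        text.setLength(0);                                   // start collecting fresh text
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // One text node may arrive in several chunks, so accumulate.
        if (current != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;                                          // outside any <book>
        }
        if ("book".equals(qName)) {
            books.add(current);                              // step 3
            current = null;
        } else if ("title".equals(qName) || "author".equals(qName)) {
            current.setField(qName, text.toString().trim()); // step 2: keep only what we need
        }
    }

    // Hypothetical helper: parse an XML string and return the collected books.
    static List<Book> parse(String xml) throws Exception {
        BookHandler handler = new BookHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.books;
    }
}
```

Any element other than the ones named in endElement is simply never stored, which is where the saving over a full DOM comes from.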
Bill