
DOM parser

Venkata Pavan Kumar vemuri

Joined: Feb 01, 2010
Posts: 14
I have a question. Can anybody please help me with the following?

A DOM parser reads the XML file to be parsed into memory and creates a tree structure based on the elements in the file. What I want to know is whether the tree is built (and the document placed in memory) before parsing or during parsing.

If the DOM parser is validating an XML file against a schema, will it place the schema file in memory as well? What is the flow of the program?

Thanks in Advance
William Brogden
Author and all-around good cowpoke

Joined: Mar 22, 2000
Posts: 13036
There really is no such thing as a "DOM parser" if you want to get picky.

A typical DOM builder uses a SAX parser to read the document and generate events corresponding to the parts of the document. It builds the DOM structure from these events. This has several advantages: the source text is read as a stream and does not have to be all in memory at one time, and one well-verified SAX parser can be used in a variety of ways.

Yes, I know that some people have created other approaches which are "lazy" and don't parse some text chunks until needed, but the mainstream works as above.
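
To make the event-driven flow above concrete, here is a minimal sketch of a DOM builder that sits on top of a SAX parser. It is not how any particular JAXP implementation is written; the class names SaxToDomSketch and DomBuildingHandler and the sample XML are made up for illustration, and it ignores namespaces, comments and processing instructions. The point is only that the tree grows during parsing, one event at a time.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxToDomSketch {

    /** Hypothetical handler: turns SAX events into DOM nodes as they arrive. */
    static class DomBuildingHandler extends DefaultHandler {
        private final Document doc;
        // Stack of elements that have been opened but not yet closed.
        private final Deque<Node> open = new ArrayDeque<>();

        DomBuildingHandler(Document doc) {
            this.doc = doc;
            open.push(doc); // the Document node is the parent of the root element
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Element e = doc.createElement(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                e.setAttribute(atts.getQName(i), atts.getValue(i));
            }
            open.peek().appendChild(e); // attach to the current parent
            open.push(e);               // children now go under this element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            open.pop();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Only attach text that appears inside an element.
            if (open.size() > 1) {
                open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
            }
        }
    }

    /** Streams the XML through a SAX parser and returns the DOM built from its events. */
    static Document build(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DomBuildingHandler(doc));
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build("<fruit><banana ripe=\"yes\">yellow</banana></fruit>");
        Element banana = (Element) doc.getDocumentElement().getFirstChild();
        // prints "banana yes yellow"
        System.out.println(banana.getTagName() + " " + banana.getAttribute("ripe")
                + " " + banana.getTextContent());
    }
}
```

Note that the source text itself is never held in memory as a whole; only the growing tree is.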

Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

Well, yeah. Think about it. The parser sees a <banana> element. It is supposed to be validating, so it must determine whether that is a valid element and whether it can appear at that location and so on. So, where does it get that information? It gets it from the schema, right? So, where is the schema at that point in time? It's in the obvious place where a computer program would get information from. It's in memory. Where else should it be?

Now of course it's not in memory in its raw form. The parser will also have parsed the schema into some more convenient internal form. But the parser is going to store that internal form in memory, since it needs to refer to it frequently throughout the course of its processing.
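
In JAXP terms, that "more convenient internal form" corresponds roughly to a javax.xml.validation.Schema object: the schema text is compiled once, kept in memory, and handed to the DocumentBuilderFactory. The following is a minimal sketch under stated assumptions; the class name ValidatingDomSketch, the helper isValid and the tiny fruit/banana schema are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidatingDomSketch {

    // Hypothetical schema: a <fruit> element containing exactly one <banana>.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"fruit\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"banana\" type=\"xs:string\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Parses the schema text once into an in-memory Schema object. */
    static Schema compileSchema() throws SAXException {
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
    }

    /** Returns false if the document is malformed or violates the schema. */
    static boolean isValid(Schema schema, String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setSchema(schema); // the builder consults the in-memory schema while parsing
        DocumentBuilder builder = dbf.newDocumentBuilder();
        // By default validation errors are only reported; rethrow so they abort the parse.
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXParseException {
                throw e;
            }
        });
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Schema schema = compileSchema(); // compiled once, reused for every document
        System.out.println(isValid(schema, "<fruit><banana>yellow</banana></fruit>"));
        System.out.println(isValid(schema, "<fruit><apple/></fruit>"));
    }
}
```

The same Schema instance can be reused across many parses, which is why it pays to keep the compiled form in memory rather than re-reading the schema file for each document.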