DOM and SAX

chandra kolluri
Greenhorn

Joined: Sep 17, 2007
Posts: 7
How is DOM different from SAX? A detailed explanation is always welcome.
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
"chandra",
The Java Ranch has thousands of visitors every week, many with surprisingly similar names. To avoid confusion we have a naming convention, described at http://www.javaranch.com/name.jsp . We require names to have at least two words, separated by a space, and strongly recommend that you use your full real name. Please choose a new name which meets the requirements.
Thanks.

Read about me at frankcarver.me ~ Raspberry Alpha Omega ~ Frank's Punchbarrel Blog
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
SAX is a stream-based parsing process. You can easily configure a SAX parse to grab certain tags and/or their contents, and ignore others. This makes it quick and effective for extracting specific information from potentially large XML sources.
DOM is a document-based parsing process. A DOM parser reads a whole XML source into a large, complex internal structure and provides lots of operations to extract, insert and modify the loaded data. DOM is good for when you need to load and transform whole documents, or create new XML documents in memory ready for such a process.
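To make the contrast concrete, here is a minimal sketch using the JAXP parsers that ship with the JDK. The sample document and the class and method names (SaxVersusDom, titlesWithSax, titlesWithDom) are made up for illustration; both methods pull out the same <title> values, one by reacting to streamed events and one by loading the whole tree first.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVersusDom {

    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<catalog><book><title>A</title><price>10</price></book>"
      + "<book><title>B</title><price>20</price></book></catalog>";

    // SAX: react to events as they stream past and keep only <title> text.
    static List<String> titlesWithSax(String xml) throws Exception {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <title>

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return titles;
    }

    // DOM: load the whole document into memory, then query the tree.
    static List<String> titlesWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("title");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            titles.add(nodes.item(i).getTextContent());
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titlesWithSax(XML));
        System.out.println(titlesWithDom(XML));
    }
}
```

The SAX version never holds more than the current title in memory, while the DOM version keeps every element of the document around until the Document is released.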
Mirko Froehlich
Ranch Hand

Joined: Aug 21, 2000
Posts: 114
I would like to add that because, as Frank already explained, DOM parses the complete XML document into a large data structure in memory, there are some pretty bad performance issues, especially when you are dealing with larger documents. I have to admit that I have not used SAX myself, but all my co-workers swear by it and I am going to use it in my future projects (if at all possible). Of course, there might be cases where the event-based SAX parser cannot be applied, but I suspect that it will work more often than not, especially if you have some control over the XML that you are processing.
-Mirko

Ajith Kallambella
Sheriff

Joined: Mar 17, 2000
Posts: 5782
Another important difference is that SAX is read-only while DOM is read-write. SAX has no facility for changing the content of the XML document being parsed. DOM, on the other hand, provides accessor and mutator methods to get and set XML data, and the changes you make persist in the in-memory document.
Perhaps this is one of the most compelling reasons to consider DOM parsing despite its performance issues. Of course, one can always implement a hybrid SAX approach that allows manipulating the XML content. It is not impossible, but it may be redundant, since DOM is already available and does exactly the same thing!
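As a rough illustration of the read-write side, the sketch below loads a document with DOM, changes it through the standard mutator methods, and serializes the result with an identity Transformer. The class name DomEdit, the method setFirstPrice, and the element and attribute names are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomEdit {

    // Changes the text of the first <price> element and tags it with a
    // currency attribute. Assumes the document has at least one <price>.
    static String setFirstPrice(String xml, String newPrice) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        Element price = (Element) doc.getElementsByTagName("price").item(0);
        price.setTextContent(newPrice);          // modify existing data
        price.setAttribute("currency", "USD");   // add new data

        // An identity transform writes the modified tree back out as text.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(setFirstPrice("<book><price>10</price></book>", "12"));
    }
}
```

A plain SAX handler has no equivalent of setTextContent or setAttribute; it only receives events.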
Hope that helps,
------------------
Ajith Kallambella M.
Sun Certified Programmer for the Java2 Platform.


Open Group Certified Distinguished IT Architect. Open Group Certified Master IT Architect. Sun Certified Architect (SCEA).
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
Which approach to use also depends on the ultimate use for the data. If you are parsing the XML in order to build some application-specific data structure from it, then using DOM can result in even larger memory requirements, as you duplicate all the information in the DOM while building it.
Don't forget though, that there is a difference between the DOM and a DOM. You can build your own lighter-weight Document Object Model, using SAX or any other simple parsing system. JDOM is an example of this; it's a full document model which is much easier to traverse in Java than using the official DOM interface.
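A minimal sketch of the "build your own model" idea, assuming a made-up Node type that keeps only element names, children and text. This is not how JDOM is implemented; it only shows that a SAX handler with a stack is enough to build a simple tree.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LightModel {

    // Hypothetical lightweight node: no attributes, namespaces or comments.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        Node(String name) {
            this.name = name;
        }
    }

    // Builds the tree from SAX events; the stack tracks the open elements.
    static Node build(String xml) throws Exception {
        Node holder = new Node("#holder"); // artificial parent of the root
        Deque<Node> open = new ArrayDeque<>();
        open.push(holder);

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                Node node = new Node(qName);
                open.peek().children.add(node);
                open.push(node);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                open.peek().text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                open.pop();
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return holder.children.get(0);
    }

    public static void main(String[] args) throws Exception {
        Node root = build("<catalog><book><title>A</title></book></catalog>");
        System.out.println(root.name + " has " + root.children.size() + " child(ren)");
    }
}
```

Because you decide what Node stores, you can drop anything your application does not need and avoid holding both a full DOM and your own structure at the same time.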
Murali Mohan
Ranch Hand

Joined: Jun 14, 2001
Posts: 47
1) SAX is faster than DOM.
2) SAX is good for large documents because it takes comparatively less memory than DOM.
3) SAX takes less time to read a document, whereas DOM takes more time.
4) With SAX we can access data but we can't modify it.
5) We can stop SAX parsing whenever and wherever we want.
6) SAX parses sequentially, whereas with DOM we can also move backwards through the document.
7) To parse machine-generated code, SAX is better. To parse human-readable documents, DOM is useful.
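Point 5 is usually done by throwing an exception from a handler callback, since SAX itself has no "stop" method. The sketch below assumes a made-up marker exception (StopParsing) and a made-up helper (firstTitle) that abandons the parse as soon as the first <title> has been read.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEarlyStop {

    // Hypothetical marker exception used only to signal "we have enough".
    static final class StopParsing extends SAXException {
        StopParsing() {
            super("stop requested by handler");
        }
    }

    // Returns the text of the first <title> element and stops reading there.
    static String firstTitle(String xml) throws Exception {
        StringBuilder result = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inTitle;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("title".equals(qName)) {
                    inTitle = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inTitle) {
                    result.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                if (inTitle && "title".equals(qName)) {
                    throw new StopParsing(); // abandon the rest of the document
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Normal early exit: the handler got what it needed.
        }
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstTitle("<c><title>A</title><title>B</title></c>"));
    }
}
```

With DOM there is no comparable shortcut, because the parser builds the complete tree before handing it over.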

[This message has been edited by Murali Mohan (edited June 15, 2001).]
Shabbir Rahman
Greenhorn

Joined: Feb 18, 2002
Posts: 18
I have a large XML file (1 GB) and need to transform it into another XML file. A DOM-based approach takes a lot of time, which is why I'm searching for a SAX-based XSLT processor. Every XSLT processor uses SAX and DOM internally, but during transformation most of them use DOM. Can anyone name an XSLT processor that can transform a big XML file and produce a big XML file as output using SAX?


-----------------
IBM XML Developer
SCJP 1.4
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

you are parsing the XML in order to build some application-specific data structure from it,

JAXB can be used.


Groovy
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Originally posted by Shabbir Rahman:
Can anyone name an XSLT processor that can transform a big XML file and produce a big XML file as output using SAX?
This is just guessing, but I don't think there is one that uses SAX. XSL transformation is a complex task (ever seen the XSL specification?), which requires the engine to move back and forth within the document due to even the simplest XPath expressions. In practice, the only viable way I can think of is to keep the whole document in memory (a la DOM).


Author of Test Driven (2007) and Effective Unit Testing (2013) [Blog] [HowToAskQuestionsOnJavaRanch]
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12823
    
Just how complex is the re-arrangement you are attempting? It seems to me that this is the basic question.
If you have to completely change the hierarchy, it's going to be a BIG memory-intensive job. If you are just doing a few renamings, adding/removing attributes or changing limited areas of the hierarchy, then you might be able to write a custom SAX-based utility or find an XSLT processor that won't use a lot of memory.
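For the simple-renaming case, a custom SAX-based utility could look roughly like the sketch below. It puts an XMLFilterImpl between the parser and an identity Transformer, so events are rewritten as they pass through and written straight to the output. The class name SaxRename and the element names are invented for the example, it ignores namespace prefixes, and whether memory use really stays flat depends on the parser and transformer implementation you run it on.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class SaxRename {

    // Renames every element called `from` to `to` while streaming.
    // Assumes unprefixed element names (no namespace handling).
    static String rename(String xml, String from, String to) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // The filter sits between the parser and the serializer.
        XMLFilterImpl filter = new XMLFilterImpl(reader) {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts)
                    throws SAXException {
                if (from.equals(qName)) {
                    super.startElement(uri, to, to, atts);
                } else {
                    super.startElement(uri, local, qName, atts);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                if (from.equals(qName)) {
                    super.endElement(uri, to, to);
                } else {
                    super.endElement(uri, local, qName);
                }
            }
        };

        // An identity transform serializes the filtered event stream.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(
            new SAXSource(filter, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rename("<items><item>1</item></items>", "item", "entry"));
    }
}
```

For a real 1 GB file you would replace the StringReader and StringWriter with file streams; the filter itself would not need to change.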
Bill
Shabbir Rahman
Greenhorn

Joined: Feb 18, 2002
Posts: 18
Apache Trax may be the solution, I will reply after testing it.
Ketan KC Chachad
Ranch Hand

Joined: Nov 23, 2003
Posts: 76
Can anyone provide me with a tutorial for SAX?


Regards,
Ketan KC Chachad
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Originally posted by Ketan Chachad:
Can anyone provide me with a tutorial for SAX?

Go to http://www.ibm.com/developerworks and search for "Understanding SAX".
 
 