"chandra", The Java Ranch has thousands of visitors every week, many with surprisingly similar names. To avoid confusion we have a naming convention, described at http://www.javaranch.com/name.jsp . We require names to have at least two words, separated by a space, and strongly recommend that you use your full real name. Please choose a new name which meets the requirements. Thanks.
SAX is a stream-based parsing process. You can easily configure a SAX parse to grab certain tags and/or their contents, and ignore others. This makes it quick and effective for extracting specific information from potentially large XML sources. DOM is a document-based parsing process. A DOM parser reads a whole XML source into a large, complex internal structure and provides lots of operations to extract, insert and modify the loaded data. DOM is good for when you need to load and transform whole documents, or create new XML documents in memory ready for such a process.
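To make the "grab certain tags and ignore others" point concrete, here is a minimal sketch using the JDK's built-in SAX parser. The class name `TitleExtractor`, the `<title>` tag and the sample XML are assumptions made up for illustration; they are not from the post above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: collects the text of every <title> element and ignores everything else.
class TitleExtractor extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private StringBuilder current; // non-null only while we are inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length); // text may arrive in several chunks
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName) && current != null) {
            titles.add(current.toString());
            current = null;
        }
    }

    // Parses the given XML string and returns the text of all <title> elements in document order.
    static List<String> extractTitles(String xml) throws Exception {
        TitleExtractor handler = new TitleExtractor();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title><price>10</price></book>"
                   + "<book><title>B</title><price>12</price></book></books>";
        System.out.println(extractTitles(xml));
    }
}
```

Only the current `<title>` text is held in memory at any time, which is why this style copes well with large inputs.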
I would like to add that because, as Frank already explained, DOM parses the complete XML document into a large data structure in memory, there are some pretty bad performance issues, especially when you are dealing with larger documents. I have to admit that I have not used SAX myself, but all my co-workers swear by it and I am going to use it in my future projects (if at all possible). Of course, there might be cases where the event-based SAX parser cannot be applied, but I suspect that it will work more often than not, especially if you have some control over the XML that you are processing. -Mirko
Originally posted by Frank Carver: SAX is a stream-based parsing process. You can easily configure a SAX parse to grab certain tags and/or their contents, and ignore others. This makes it quick and effective for extracting specific information from potentially large XML sources. DOM is a document-based parsing process. A DOM parser reads a whole XML source into a large, complex internal structure and provides lots of operations to extract, insert and modify the loaded data. DOM is good for when you need to load and transform whole documents, or create new XML documents in memory ready for such a process.
Another important difference is that SAX is read-only while DOM is read-write. SAX has no facility for changing the content of the XML document being parsed. DOM, on the other hand, provides mutator methods to get and set XML data, and the changes you make persist in the in-memory document, which can then be written back out. This is perhaps one of the most compelling reasons to consider DOM parsing despite its performance issues. Of course, one can always build a SAX-based hybrid that allows manipulating the XML content. It is not impossible, but it may be redundant, since DOM is already available and does exactly that! Hope that helps,
------------------
Ajith Kallambella M.
Sun Certified Programmer for the Java2 Platform.
Open Group Certified Distinguished IT Architect. Open Group Certified Master IT Architect. Sun Certified Architect (SCEA).
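As a rough illustration of the read-write point, the sketch below loads a document into a DOM, changes it through the standard mutator methods, and serializes the result. The class name `DomPriceEditor`, the `<price>` element and the `updated` attribute are hypothetical and chosen only for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example: adds a fixed amount to every <price> and marks it as updated.
class DomPriceEditor {
    static String raisePrices(String xml, int amount) throws Exception {
        // The whole document is loaded into memory as a tree.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList prices = doc.getElementsByTagName("price");
        for (int i = 0; i < prices.getLength(); i++) {
            Element price = (Element) prices.item(i);
            int old = Integer.parseInt(price.getTextContent().trim());
            price.setTextContent(String.valueOf(old + amount)); // mutate the text
            price.setAttribute("updated", "true");              // mutate the attributes
        }

        // An identity transform serializes the modified tree back to XML text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(raisePrices("<books><price>10</price><price>12</price></books>", 5));
    }
}
```

Note that the edits live in the in-memory tree; nothing changes in the original source until the tree is serialized somewhere.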
Which approach to use also depends on the ultimate use for the data. If you are parsing the XML in order to build some application-specific data structure from it, then using DOM can result in even larger memory requirements, as you duplicate all the information in the DOM while building it. Don't forget though, that there is a difference between the DOM and a DOM. You can build your own lighter-weight Document Object Model, using SAX or any other simple parsing system. JDOM is an example of this; it's a full document model which is much easier to traverse in Java than using the official DOM interface.
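The idea of building your own lighter-weight document model on top of SAX could look something like the sketch below. `LightNode` and `LightTreeBuilder` are hypothetical names, and this is deliberately far less than what JDOM or the official DOM provide: it keeps only element names, child elements and text, and drops attributes, comments and namespaces.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical minimal tree node: element name, child elements, and accumulated text.
class LightNode {
    final String name;
    final List<LightNode> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder();

    LightNode(String name) {
        this.name = name;
    }
}

// Builds a LightNode tree from SAX events, using a stack of currently open elements.
class LightTreeBuilder extends DefaultHandler {
    private final Deque<LightNode> open = new ArrayDeque<>();
    private LightNode root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        LightNode node = new LightNode(qName);
        if (open.isEmpty()) {
            root = node;                     // first element is the document root
        } else {
            open.peek().children.add(node);  // attach to the innermost open element
        }
        open.push(node);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            open.peek().text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    static LightNode build(String xml) throws Exception {
        LightTreeBuilder handler = new LightTreeBuilder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.root;
    }

    public static void main(String[] args) throws Exception {
        LightNode root = build("<order><item>pen</item><item>ink</item></order>");
        for (LightNode child : root.children) {
            System.out.println(child.name + " = " + child.text);
        }
    }
}
```

In a real application the handler would more likely build the application-specific objects directly, so the data is never held twice.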
1) SAX is faster than DOM.
2) SAX is good for large documents because it takes comparatively less memory than DOM.
3) SAX takes less time to read a document, whereas DOM takes more time.
4) With SAX we can access data but we can't modify it.
5) We can stop SAX parsing whenever and wherever we want.
6) SAX parses sequentially, but with DOM we can also move backwards.
7) SAX is better for parsing machine-generated XML. DOM is useful for parsing human-readable documents.
[This message has been edited by Murali Mohan (edited June 15, 2001).]
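Point 5 in the list above is usually done by throwing an exception from a handler callback, since SAX has no dedicated "stop" call that I know of. The sketch below is one way to do that; `FirstMatchFinder` and `StopParsing` are hypothetical names invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: stops parsing as soon as the first element with a given name is seen.
class FirstMatchFinder extends DefaultHandler {
    // Marker exception used only to abort the parse deliberately.
    static class StopParsing extends SAXException {
        StopParsing() {
            super("stopped deliberately after the first match");
        }
    }

    private final String wantedTag;
    private int elementsSeen = 0;
    private boolean found = false;

    FirstMatchFinder(String wantedTag) {
        this.wantedTag = wantedTag;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        elementsSeen++;
        if (wantedTag.equals(qName)) {
            found = true;
            throw new StopParsing(); // unwinds out of parse(); the rest of the input is not processed
        }
    }

    // Returns how many start tags were processed up to and including the match, or -1 if none matched.
    static int elementsSeenUntilMatch(String xml, String tag) throws Exception {
        FirstMatchFinder handler = new FirstMatchFinder(tag);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            if (!handler.found) {
                throw e; // a genuine parse error, not our deliberate stop
            }
        }
        return handler.found ? handler.elementsSeen : -1;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementsSeenUntilMatch("<a><b/><c/><d/></a>", "c"));
    }
}
```

The catch block checks the `found` flag rather than the exception type, so it does not depend on whether a particular parser rethrows the original exception or wraps it.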
I have a large XML file (1GB) and need to transform it into another XML file. A DOM-based approach takes a lot of time, which is why I'm searching for a SAX-based XSLT processor. Every XSLT processor uses SAX and DOM internally, but during the transformation itself most of them use DOM. Can anyone name an XSLT processor that can transform a big XML file and write a big XML file as output using SAX?
-----------------
IBM XML Developer
SCJP 1.4
Originally posted by Shabbir Rahman: Can anyone name an XSLT processor that can transform a big XML file and write a big XML file as output using SAX?
This is just guessing, but I don't think there is one that uses SAX. XSL transformation is a complex task (ever seen the XSL specification?), which requires the engine to move back and forth within the document due to even the simplest XPath expressions. In practice, the only viable way I can think of is to keep the whole document in memory (a la DOM).
Just how complex is the re-arrangement you are attempting? It seems to me that this is the basic question. If you have to completely change the hierarchy, it's going to be a BIG memory-intensive job. If you are just doing a few renamings, adding/removing attributes or changing limited areas of the hierarchy, then you might be able to write a custom SAX-based utility or find an XSLT processor that won't use a lot of memory. Bill
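For the "few renamings" case, a custom SAX-based utility might be sketched with the standard `XMLFilterImpl` class feeding an identity `Transformer`, so that events flow from parser to output without an explicit DOM being built in this code. `RenameFilter` is a hypothetical name, the example matches on the plain qualified name only (no namespace handling), and whether a given `Transformer` implementation truly streams a `SAXSource` is an assumption you would want to verify on a large file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example: renames one element while passing all other SAX events through unchanged.
class RenameFilter extends XMLFilterImpl {
    private final String from;
    private final String to;

    RenameFilter(XMLReader parent, String from, String to) {
        super(parent);
        this.from = from;
        this.to = to;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (from.equals(qName)) {
            super.startElement(uri, to, to, atts);
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (from.equals(qName)) {
            super.endElement(uri, to, to);
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    static String rename(String xml, String from, String to) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // the serializer expects namespace-aware events
        XMLReader parser = factory.newSAXParser().getXMLReader();
        RenameFilter filter = new RenameFilter(parser, from, to);

        // A Transformer created without a stylesheet simply copies its input to its output.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rename("<lib><name>A</name><other>x</other></lib>", "name", "title"));
    }
}
```

For a 1GB file you would swap the `StringReader` and `StringWriter` for file streams; the strings here only keep the example self-contained.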