Removing extra #text elements from DOM tree

Tony Walters
Ranch Hand

Joined: Feb 13, 2003
Posts: 54
Hiya
Bit of a newbie question probably.
I have an XML document which has been parsed into a DOM. The document looks like this:
<catalog>
<book>
stuff here
</book>
</catalog>
The DOM tree appears to be as follows:
catalog
/ | \
#text book #text
I'm sure I have read that by including a DOCTYPE in the XML document, linking to a DTD, I can get rid of the unnecessary #text elements in the DOM tree, but this does not seem to work.
Any suggestions would be *very* much appreciated as I have already wasted a day on this! Doh!
Naren
Greenhorn

Joined: Jul 21, 2003
Posts: 23
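There is a method named setIgnoringElementContentWhitespace() in the class DocumentBuilderFactory (package: javax.xml.parsers); isIgnoringElementContentWhitespace() only reports the current setting.
Call it with 'true' before you get an instance of DocumentBuilder from the factory. The parser also has to be validating (setValidating(true)) against your DTD, because it relies on the DTD's content model to tell which whitespace is ignorable.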
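For what it's worth, here is a minimal sketch of that setup. It parses the catalog document from the question twice, once with default settings and once with validation and whitespace ignoring switched on, and lists the child nodes of catalog each time. The class name IgnoreWhitespaceDemo, the helper childNames, and the inline DTD are assumptions made up for the example; the original document links to an external DTD, which should behave the same way as long as it declares catalog as containing only elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IgnoreWhitespaceDemo {

    // The document from the question, with an inline DTD added for the demo
    // (assumption: the real document points at an external DTD instead).
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE catalog [\n"
        + "  <!ELEMENT catalog (book*)>\n"
        + "  <!ELEMENT book (#PCDATA)>\n"
        + "]>\n"
        + "<catalog>\n"
        + "<book>\n"
        + "stuff here\n"
        + "</book>\n"
        + "</catalog>\n";

    // Parses XML and returns the node names of the root element's children.
    static List<String> childNames(boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Both switches must be set BEFORE newDocumentBuilder() is called.
        factory.setValidating(ignoreWhitespace);
        factory.setIgnoringElementContentWhitespace(ignoreWhitespace);

        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(XML)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default settings:    " + childNames(false));
        System.out.println("ignoring whitespace: " + childNames(true));
    }
}
```

With the default settings the list should contain the two #text nodes around book; with both switches on, only book should remain. Whitespace inside book is kept either way, since book is declared as #PCDATA.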


Tony Walters
Ranch Hand

Joined: Feb 13, 2003
Posts: 54
Thanks for that, I'll give it a go.
 