documentBuilderFactory parse problem

R Ang
Greenhorn

Joined: Aug 29, 2008
Posts: 2
I am trying to create an XML document using the JDK 1.5 DocumentBuilderFactory class. I use the statement shown under CODE below to read the sample XML string and build the document.
The problem is with the way the input document is parsed: the FIELD2 element in the output document comes out empty. I need the spaces from the input document to remain in the output.
I checked documentBuilderFactory.isIgnoringElementContentWhitespace() and it returns false (the default).
How can I parse this document and retain elements whose text consists only of blanks?

Thanks


CODE:

Document resultDocument = documentBuilder.parse(new InputSource(
        new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(inputXMLString.getBytes())))));


INPUT XML:

<?xml version="1.0" encoding="UTF-8"?>
<MESSAGE >
<MESSAGE_DATA>
<FIELD1>SOME TEXT</FIELD1>
<FIELD2> </FIELD2>
<FIELD3>SOME TEXT</FIELD3>
</MESSAGE_DATA>
</MESSAGE>


OUTPUT XML:

<?xml version="1.0" encoding="UTF-8"?>
<MESSAGE >
<MESSAGE_DATA>
<FIELD1>SOME TEXT</FIELD1>
<FIELD2/>
<FIELD3>SOME TEXT</FIELD3>
</MESSAGE_DATA>
</MESSAGE>
Paul Clapham
Sheriff

Joined: Oct 14, 2005
Posts: 19728

Since all the other white space in your document is being preserved, my guess is that there actually is no space between the FIELD2 start tag and end tag.
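
One way to test that guess is to look at the parsed DOM directly, before any output step runs. The following is only a minimal sketch: the class name Field2Check and the helper field2Text are made up for illustration, and the XML string is a shortened stand-in for the sample document. It parses with a default DocumentBuilderFactory and reports how many characters of text FIELD2 holds.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original poster's code.
public class Field2Check {

    // Parses the XML string and returns the text content of the first FIELD2 element.
    static String field2Text(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element field2 = (Element) doc.getElementsByTagName("FIELD2").item(0);
        return field2.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for the sample document, with one space inside FIELD2.
        String xml = "<MESSAGE><MESSAGE_DATA><FIELD1>SOME TEXT</FIELD1>"
                + "<FIELD2> </FIELD2><FIELD3>SOME TEXT</FIELD3>"
                + "</MESSAGE_DATA></MESSAGE>";
        String text = field2Text(xml);
        // A length of 0 would mean the space was lost during parsing.
        System.out.println("FIELD2 text length: " + text.length());
    }
}
```

If the length is greater than zero, the parser kept the space, and the place to look is whatever code writes the document back out rather than the parse call.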
R Ang
Greenhorn

Joined: Aug 29, 2008
Posts: 2
No, a space actually does exist. This is just an example, but there are other documents with more than one space and they all result in empty elements. To confirm this, I printed the string containing the XML before parsing, and the spaces are there.
Paul Clapham
Sheriff

Joined: Oct 14, 2005
Posts: 19728

Here's my code:

It doesn't do what your code does; instead, it preserves the space.

(I removed a lot of the excess baggage in the part of the code that creates the DOM, but that shouldn't make any difference.)
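
The code from this reply does not appear in the thread. What follows is not that code. It is a minimal sketch of a parse-and-serialize round trip of the kind described, under a few assumptions: the class name RoundTrip and the method roundTrip are made up for illustration, the input is a shortened stand-in for the sample document, and the output step is a plain identity Transformer from javax.xml.transform with no indentation settings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration, not the code from the original reply.
public class RoundTrip {

    // Parses the XML string into a DOM and serializes it back to a string.
    static String roundTrip(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // newTransformer() with no stylesheet performs an identity copy.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<MESSAGE><MESSAGE_DATA><FIELD1>SOME TEXT</FIELD1>"
                + "<FIELD2> </FIELD2><FIELD3>SOME TEXT</FIELD3>"
                + "</MESSAGE_DATA></MESSAGE>";
        System.out.println(roundTrip(xml));
    }
}
```

With the default JDK parser and transformer, the printed document should still contain FIELD2 with its space. If a different serializer or output setting is in use and FIELD2 comes out as an empty element, comparing it against this round trip can help narrow down which step drops the whitespace.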
 