How to remove #text elements (white spaces) from a org.w3c.dom.Element

Max J. Power
Greenhorn

Joined: Oct 19, 2011
Posts: 2
Hello all,

I have a problem with the number of child elements of a certain element. The goal is to count all child elements, but I'm also getting the Text nodes created by the white space after each node:

<some_node>
<node1>text1</node1>
<node2>text1</node2>
<node3>text1</node3>
</some_node>

So for "someNode.getChildNodes().getLength()" I get 7 instead of 3.

What is the best practice here? I tried a loop with a counter that is increased whenever a node of type Element is found. That works, but aren't there better ways? I also don't want to use XSLT, as I would then have to maintain two different technologies.
Would it be better to remove the white space at the beginning of the code and then work with a "correct" element? If yes, what is the best way to do that?

Thank you!

Max J. Power
Max J. Power
Greenhorn

Joined: Oct 19, 2011
Posts: 2
I think I found a solution. The following method removes the white space from a DOM Node (a Node can be an Element or a Document):



After this transformation, the Node.getChildNodes().getLength() method returns the right value.
Thomas Rochon
Ranch Hand

Joined: Jul 11, 2002
Posts: 72
Max, you are my hero ;)
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12803
    
If your only need is to count Elements, your original loop approach with a test for Node type will certainly be faster, since it does not require any additional object creation.
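
For reference, a minimal sketch of that counting loop. The class name ElementChildCounter and the parse helper are made up for illustration. The XML is the sample from the first post, assumed to have exactly one line break between the tags.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class ElementChildCounter {

    /** Counts only the direct children of parent that are Element nodes. */
    static int countElementChildren(Node parent) {
        int count = 0;
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    /** Parses an XML string with the default, non-validating factory settings. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parse(
                "<some_node>\n<node1>text1</node1>\n<node2>text1</node2>\n"
                + "<node3>text1</node3>\n</some_node>");
        // All children, including the whitespace-only Text nodes: 7
        System.out.println(someNode.getChildNodes().getLength());
        // Element children only: 3
        System.out.println(countElementChildren(someNode));
    }
}
```

The loop walks the sibling chain directly, so it creates no new objects and leaves the tree unchanged.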

Bill




g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 535
    
If those text nodes are hated that much, then rather than using some helper to convert the element to a byte array and back again (which I find "ridiculous", sorry!), you can have that loop serve both purposes: cleaning the text nodes out once and for all, and counting the element nodes that are children of it.

From there onward, someNode will contain only element nodes as its children.
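
A loop of that kind might look like the sketch below. It is an illustration under assumptions, not the code from this post: the class and method names are invented, and it removes only Text nodes that consist entirely of white space, so real text in mixed content would be left in place.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class WhitespaceTextStripper {

    /**
     * Removes the whitespace-only Text children of parent in place and
     * returns the number of Element children found along the way.
     */
    static int stripAndCount(Node parent) {
        int elements = 0;
        Node child = parent.getFirstChild();
        while (child != null) {
            // Remember the next sibling first: removing child detaches it.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                elements++;
            }
            child = next;
        }
        return elements;
    }

    /** Parses an XML string with the default, non-validating factory settings. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parse(
                "<some_node>\n<node1>text1</node1>\n<node2>text1</node2>\n"
                + "<node3>text1</node3>\n</some_node>");
        System.out.println(stripAndCount(someNode));                 // 3
        System.out.println(someNode.getChildNodes().getLength());    // 3 after cleaning
    }
}
```

The method handles direct children only. Cleaning a whole subtree would need a recursive call on each Element child.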

On this matter, I just want to point out that DocumentBuilderFactory has a setting for it, setIgnoringElementContentWhitespace. Unfortunately, whether because of divided interpretations of the recommendations or a gross oversight in the building of JDK 1.6, the behavior in 1.6 has regressed to what it was before 1.5. But that's how it is for the moment.
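
For anyone who wants to try the factory setting, here is a minimal sketch. The DocumentBuilderFactory Javadoc says that setIgnoringElementContentWhitespace requires the parser to be in validating mode, because it relies on the content model. The sketch therefore adds an inline DTD to the sample XML; that DTD and the class name are assumptions for illustration. It has not been checked against the JDK 1.6 behavior mentioned above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class IgnoreWhitespaceDemo {

    // The sample from the first post, plus an assumed inline DTD that
    // declares some_node as having element-only content.
    static final String XML =
            "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE some_node [\n"
            + "  <!ELEMENT some_node (node1, node2, node3)>\n"
            + "  <!ELEMENT node1 (#PCDATA)>\n"
            + "  <!ELEMENT node2 (#PCDATA)>\n"
            + "  <!ELEMENT node3 (#PCDATA)>\n"
            + "]>\n"
            + "<some_node>\n<node1>text1</node1>\n<node2>text1</node2>\n"
            + "<node3>text1</node3>\n</some_node>";

    /** Parses xml in validating mode and returns the root's child node count. */
    static int childCountWithIgnoring(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);                        // needed for the next setting
            factory.setIgnoringElementContentWhitespace(true);  // drop ignorable white space
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // On a parser that honours the setting, only the three elements remain.
        System.out.println(childCountWithIgnoring(XML));
    }
}
```

Without a DTD (or with validation off), the parser cannot tell which white space is ignorable, and the Text nodes stay in the tree.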
 