| Author |
How to remove #text elements (white spaces) from a org.w3c.dom.Element
|
Max J. Power
Greenhorn
Joined: Oct 19, 2011
Posts: 2
|
|
Hello all,
I have a problem counting the child elements of a certain element. The goal is to count all child elements, but I am also getting the Text nodes created by the white space after each node:
<some_node>
<node1>text1</node1>
<node2>text1</node2>
<node3>text1</node3>
</some_node>
So for "someNode.getChildNodes().getLength()" I get 7 instead of 3.
What is the best practice here? I tried a loop with a counter that is incremented whenever a node of type Element is found, which works, but aren't there better ways? I also don't want to use XSLT, as I would then have to maintain two different technologies.
Would it be better to remove the white space at the beginning of the code and then work with a "correct" element? If so, what is the best way to do that?
Thank you!
Max J. Power
|
 |
Max J. Power
Greenhorn
Joined: Oct 19, 2011
Posts: 2
|
|
I think I found a solution. The following method will remove the white space from a DOM Node (a Node can be an Element or a Document):
After this transformation, the Node.getChildNodes().getLength() method returns the right value.
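For illustration only, and not necessarily the method this post refers to, here is a minimal sketch of one way to strip whitespace-only text nodes below a Node using the JDK's XPath API. The class name WhitespaceStripper and the helper parseRoot are hypothetical names made up for this example. Note that it also removes whitespace-only text inside leaf elements, which may or may not be wanted.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
final class WhitespaceStripper {

    // Removes every descendant text node of 'node' that contains only white space.
    static void removeWhitespaceNodes(Node node) {
        try {
            // ".//text()" limits the search to descendants of the given node.
            NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(".//text()[normalize-space(.)='']", node, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses an XML string and returns its root element (helper for the demo).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parseRoot("<some_node>\n<node1>text1</node1>\n"
                + "<node2>text1</node2>\n<node3>text1</node3>\n</some_node>");
        removeWhitespaceNodes(someNode);
        System.out.println(someNode.getChildNodes().getLength()); // prints 3
    }
}
```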
|
 |
Thomas Rochon
Ranch Hand
Joined: Jul 11, 2002
Posts: 72
|
|
|
Max, you are my hero ;)
|
 |
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 12271
|
|
If your only need is to count Elements, your original loop approach with a test for Node type will certainly be faster, since it does not require any additional object creation.
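A minimal sketch of such a counting loop, assuming the sample XML from the first post. The class name ChildElementCounter and the helper parseRoot are hypothetical names for this example; the loop itself uses only standard org.w3c.dom calls and leaves the tree untouched.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
final class ChildElementCounter {

    // Counts the direct children of 'parent' that are Elements, skipping text nodes.
    static int countChildElements(Node parent) {
        int count = 0;
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    // Parses an XML string and returns its root element (helper for the demo).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parseRoot("<some_node>\n<node1>text1</node1>\n"
                + "<node2>text1</node2>\n<node3>text1</node3>\n</some_node>");
        System.out.println(someNode.getChildNodes().getLength()); // prints 7
        System.out.println(countChildElements(someNode));         // prints 3
    }
}
```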
Bill
|
Java Resources at www.wbrogden.com
|
 |
g tsuji
Ranch Hand
Joined: Jan 18, 2011
Posts: 368
|
|
If those text nodes are hated that much, then rather than using some helper to convert the element to a byte array and back (which I find "ridiculous", sorry!), you can make that loop serve both purposes: cleaning the text nodes out once and for all, and counting the element nodes that are children of the element.
From there onward, someNode will contain only element nodes as its children.
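A minimal sketch of a loop along those lines, assuming that only whitespace-only text nodes directly under the element should go. The class name CleanAndCount and the helper parseRoot are hypothetical names for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
final class CleanAndCount {

    // Removes whitespace-only text children of 'parent' and returns
    // the number of child elements, all in a single pass.
    static int stripWhitespaceAndCount(Node parent) {
        int elements = 0;
        Node child = parent.getFirstChild();
        while (child != null) {
            // Fetch the sibling first: removing 'child' detaches it from the list.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                elements++;
            }
            child = next;
        }
        return elements;
    }

    // Parses an XML string and returns its root element (helper for the demo).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parseRoot("<some_node>\n<node1>text1</node1>\n"
                + "<node2>text1</node2>\n<node3>text1</node3>\n</some_node>");
        System.out.println(stripWhitespaceAndCount(someNode));    // prints 3
        System.out.println(someNode.getChildNodes().getLength()); // prints 3
    }
}
```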
On this matter, I just want to point out that DocumentBuilderFactory has a setting for this, setIgnoringElementContentWhitespace. Unfortunately, whether because of differing interpretations of the recommendation or a gross oversight in the JDK 1.6 build, its behavior in 1.6 has regressed to what it was before 1.5. But that's how it is for the moment.
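A minimal sketch of that factory setting, under one important assumption: per the DocumentBuilderFactory Javadoc, the setting relies on the content model and requires a validating parser, so the sample below carries an internal DTD that was not part of the original XML. The class name IgnoreWhitespaceDemo and the method parseIgnoringWhitespace are hypothetical names for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not taken from the thread.
final class IgnoreWhitespaceDemo {

    // The DTD is an assumption added for this sketch; it declares
    // some_node as having element-only content.
    static final String XML =
            "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE some_node [\n"
            + "<!ELEMENT some_node (node1, node2, node3)>\n"
            + "<!ELEMENT node1 (#PCDATA)>\n"
            + "<!ELEMENT node2 (#PCDATA)>\n"
            + "<!ELEMENT node3 (#PCDATA)>\n"
            + "]>\n"
            + "<some_node>\n<node1>text1</node1>\n"
            + "<node2>text1</node2>\n<node3>text1</node3>\n</some_node>";

    // Parses 'xml' with validation on and element-content white space ignored.
    static Element parseIgnoringWhitespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // needed so the parser knows the content model
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Fail loudly on validation problems instead of printing to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) throws SAXParseException { throw e; }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parseIgnoringWhitespace(XML);
        // When the parser honors the setting, only the three elements remain.
        System.out.println(someNode.getChildNodes().getLength());
    }
}
```

Without a DTD or with validation off, the parser cannot tell ignorable white space from real text, so the whitespace text nodes stay in the tree.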
|