
How to remove #text elements (white spaces) from an org.w3c.dom.Element

 
Max J. Power
Greenhorn
Posts: 2
Hello all,

I have a problem counting the child elements of a certain element. The goal is to count all child elements, but I am also getting the Text nodes that are created by the whitespace after each node:

<some_node>
<node1>text1</node1>
<node2>text1</node2>
<node3>text1</node3>
</some_node>

So for "someNode.getChildNodes().getLength()" I get 7 instead of 3 (three elements plus four whitespace-only Text nodes).

What is the best practice here? I tried a loop with a counter that is incremented whenever a node of type Element is found. That works, but aren't there better ways? I also don't want to use XSLT, as I would then have to maintain two different technologies.
Would it be better to remove the whitespace at the beginning of the code and then work with a "correct" element? If so, what is the best way to do that?

Thank you!

Max J. Power
 
Max J. Power
Greenhorn
Posts: 2
I think I found a solution. The following method removes the whitespace from a DOM Node (a Node can be an Element or a Document):



After this transformation, Node.getChildNodes().getLength() returns the correct value.
 
Thomas Rochon
Ranch Hand
Posts: 72
Max, you are my hero ;)
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13058
If your only need is to count Elements, your original loop approach with a test for Node type will certainly be faster since it does not require any additional object creation.
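
A minimal sketch of that counting loop, using the sample XML from the first post. The class name ChildElementCounter and the helper parseRoot are made up for this illustration; only the org.w3c.dom and javax.xml.parsers calls are standard API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
class ChildElementCounter {

    // Counts only the direct children that are elements. Text, comment and
    // other node types are skipped; no nodes are created or removed.
    static int countChildElements(Node parent) {
        int count = 0;
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    // Demo helper (an assumption, not from the thread): parses an XML
    // string with a default, non-validating parser and returns its root.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<some_node>\n<node1>text1</node1>\n<node2>text1</node2>\n"
                + "<node3>text1</node3>\n</some_node>";
        Element someNode = parseRoot(xml);
        // All child nodes, whitespace-only text nodes included.
        System.out.println(someNode.getChildNodes().getLength());
        // Element children only.
        System.out.println(countChildElements(someNode));
    }
}
```

Walking the siblings with getFirstChild()/getNextSibling() avoids indexing into the NodeList, but a plain indexed loop over getChildNodes() would give the same count.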

Bill




 
g tsuji
Ranch Hand
Posts: 656
If those text nodes are hated that much, then rather than using a helper that converts the element to a byte array and back (which I find "ridiculous", sorry!), you can write one loop that serves both purposes: cleaning the text nodes out once and for all, and counting the element nodes that are its children.
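
One way such a combined loop might look is sketched below. It is only an illustration under assumptions: the class name WhitespaceStripper and the methods stripAndCount and parseRoot are hypothetical, and the loop handles direct children only, not deeper descendants.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
class WhitespaceStripper {

    // Removes the whitespace-only text nodes that are direct children of
    // parent and returns the number of element children seen on the way.
    static int stripAndCount(Node parent) {
        int elements = 0;
        Node child = parent.getFirstChild();
        while (child != null) {
            // Remember the next sibling first: removing child detaches it.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                elements++;
            }
            child = next;
        }
        return elements;
    }

    // Demo helper (an assumption, not from the thread).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parseRoot("<some_node>\n<node1>text1</node1>\n"
                + "<node2>text1</node2>\n<node3>text1</node3>\n</some_node>");
        System.out.println(stripAndCount(someNode));
        System.out.println(someNode.getChildNodes().getLength());
    }
}
```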

From then on, someNode will contain only element nodes as its children.

On this matter, I just want to point out that DocumentBuilderFactory has a settable feature, IgnoringElementContentWhitespace. Unfortunately, whether because of divided interpretations of the recommendations or a gross oversight in the building of JDK 1.6, 1.6 regresses on this bug/feature to the behavior from before 1.5. But that's how it is for the moment.
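
For reference, a hedged sketch of how that factory setting is typically used. Per the DocumentBuilderFactory Javadoc, the setting relies on the content model and needs the parser in validating mode, so the sample XML here carries an internal DTD that is not part of the original post. The class name IgnoreWhitespaceDemo and the method parseIgnoringWhitespace are made up, and the result may differ on the JDK versions affected by the regression mentioned above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
class IgnoreWhitespaceDemo {

    // Sample document with an internal DTD (an assumption added for the
    // demo) that declares some_node as having element-only content.
    static final String XML =
            "<!DOCTYPE some_node [\n"
            + "<!ELEMENT some_node (node1, node2, node3)>\n"
            + "<!ELEMENT node1 (#PCDATA)>\n"
            + "<!ELEMENT node2 (#PCDATA)>\n"
            + "<!ELEMENT node3 (#PCDATA)>\n"
            + "]>\n"
            + "<some_node>\n<node1>text1</node1>\n<node2>text1</node2>\n"
            + "<node3>text1</node3>\n</some_node>";

    // Parses with validation on and asks the parser to drop whitespace
    // that sits in element-only content.
    static Element parseIgnoringWhitespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element someNode = parseIgnoringWhitespace(XML);
        // Number of child nodes the parser kept under some_node.
        System.out.println(someNode.getChildNodes().getLength());
    }
}
```

Without a DTD (or another grammar) the parser cannot tell ignorable whitespace from significant text, which is why the plain sample from the first post is not affected by this setting on its own.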
 