
Selecting text without a node with XPath

 
Tim Patton
Greenhorn
Posts: 7
I'm trying to figure out how to select some data out of HTML that looks like this:

<p>
<b>aaaa</b>
<i>bbbbb</i>
cccccc
</p>

Getting 'aaaa' and 'bbbbb' is easy, but how do I select 'cccccc' independently of the others? Somehow I want to say "grab all the text after the close of <i> up to </p>". I can't figure out how to get this sort of "free text" out with XPath.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13064
Well, that text is in a node - a TEXT_NODE. If you get the XPath expression to return a NODESET (the equivalent of Java's NodeList) for the children of the <p> element, you can iterate through it until you find a node of type TEXT_NODE and then take the value of that node.
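
A minimal sketch of that iteration, assuming the markup is well-formed enough to be parsed as XML with the JDK's built-in parser. The class name TextNodeScan and the helper freeText are made up for illustration; they are not from the original question. Whitespace between tags also shows up as text nodes, so the sketch skips blank ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextNodeScan {
    // Hypothetical helper: collects the non-blank text nodes that are
    // direct children of any <p> element, trimmed of surrounding whitespace.
    static List<String> freeText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // node() matches every child of <p>: elements and text nodes alike.
            NodeList children = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//p/node()", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() == Node.TEXT_NODE) {
                    String value = n.getNodeValue().trim();
                    if (!value.isEmpty()) {
                        result.add(value); // keep only text that is not pure whitespace
                    }
                }
            }
            return result;
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<p>\n<b>aaaa</b>\n<i>bbbbb</i>\ncccccc\n</p>";
        System.out.println(freeText(xml)); // prints [cccccc]
    }
}
```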

Bill
 
Tim Patton
Greenhorn
Posts: 7
I actually figured it out; the XPath to use would be something like:

//p/text()

If there is more than one block of text, a positional predicate selects a single one:

//p/text()[1]
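
A rough sketch of how these expressions behave with the JDK's javax.xml.xpath API, assuming the sample is parsed as XML with whitespace preserved. In that case the line breaks between tags are text nodes too, so text()[1] may land on a whitespace-only node; filtering with normalize-space() or stepping from <i> with following-sibling is one way around that. The class name TextXPath and the helper select are hypothetical, and an HTML-aware parser may treat whitespace differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TextXPath {
    // Hypothetical helper: evaluates an XPath 1.0 expression and returns the
    // string value of the first matching node, trimmed.
    static String select(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc).trim();
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<p>\n<b>aaaa</b>\n<i>bbbbb</i>\ncccccc\n</p>";
        // The first text child of <p> is the line break before <b>, so this prints [].
        System.out.println("[" + select(xml, "//p/text()[1]") + "]");
        // Keep only text nodes that are not pure whitespace: prints [cccccc].
        System.out.println("[" + select(xml, "//p/text()[normalize-space()]") + "]");
        // "The text right after </i>": prints [cccccc].
        System.out.println("[" + select(xml, "//p/i/following-sibling::text()[1]") + "]");
    }
}
```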
 