Extracting a nested XML document

 
Ranch Hand
Posts: 462
Scala jQuery Java
Hi,
I have a bean that consumes an XML document which contains another XML document, with all the < and > tags replaced by HTML tags. I'm told this is standard practice. Can anyone point me in the right direction for extracting the inner XML document in the correct format? If I try to parse it using:



it works but I can't then access any of the values....

 
Ranch Hand
Posts: 2187
The '<' and '>' characters are tag delimiters. What do you mean by "replaced by HTML tags"? Can you provide an example?

 
Will Myers
Ranch Hand
Posts: 462
Scala jQuery Java
< becomes the html tag & lt; and > becomes & gt;

I would post an example, but this forum converts them to < and > and I don't know how to escape them.
 
Jimmy Clark
Ranch Hand
Posts: 2187
Thanks. & lt; and & gt; are HTML entities. They are not considered "tags".

Attempting to "nest" an XML document within another sounds like a bad design idea and conflicts with the core premise of XML. The difficulty you are encountering is a result of poor design.
 
Author and all-around good cowpoke
Posts: 13078
Bad design! You got that right! For years I have been dealing with a client who got stuck with this design.

A CDATA section is used to hide a complete XML document text - to work with it I have to extract the entire CDATA section to a String, build an org.xml.sax.InputSource from the String and parse that to a DOM.

Then of course all of the normal org.w3c.dom and related methods work to access values.
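
The steps above might look roughly like the following. This is only a sketch under stated assumptions: the element names `envelope`, `payload`, `order` and `price`, and the helper `innerPrice`, are invented for illustration and do not come from the thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataInnerXml {

    // Parse a String holding XML text into a DOM Document via an InputSource.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: pulls the CDATA text out of <payload>, parses it as
    // a second document, and reads <price> from that inner document.
    static String innerPrice(String outerXml) throws Exception {
        Document outer = parse(outerXml);
        // getTextContent() returns the CDATA section's content as a plain String.
        String innerXml = outer.getElementsByTagName("payload").item(0).getTextContent();
        Document inner = parse(innerXml);
        return inner.getElementsByTagName("price").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input: an inner document hidden in a CDATA section.
        String outerXml =
            "<envelope><payload><![CDATA[<order><price>9.99</price></order>]]></payload></envelope>";
        System.out.println(innerPrice(outerXml));
    }
}
```

Once the inner text has been parsed into its own `Document`, the usual `org.w3c.dom` calls apply to it exactly as they do to the outer one.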

Bill
 
Will Myers
Ranch Hand
Posts: 462
Scala jQuery Java
Bad design, but there's not much I can do about that. I have got round it by using XPath to extract the inner XML that I'm interested in, then replacing all the HTML entities with the XML ones, then working on the result as normal. Bit of a faff....
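
For readers trying the same workaround, here is a minimal sketch of the XPath route, assuming the inner document is escaped once (the entity form of < and >) and sits in a hypothetical `/envelope/payload` element. The names `payload`, `order`, `status` and `innerStatus` are illustrative only. One thing worth knowing: a conforming XML parser already decodes those entities when it reads the outer document, so the string XPath returns contains real angle brackets. A manual replace step should only be needed if the payload was escaped twice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapedInnerXml {

    // Parse a String holding XML text into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: extracts the escaped inner document with XPath,
    // parses it, and reads /order/status from the result.
    static String innerStatus(String outerXml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Document outer = parse(outerXml);
        // The string value of <payload>; the parser has already turned the
        // escaped entities back into < and > at this point.
        String innerXml = xpath.evaluate("/envelope/payload", outer);
        Document inner = parse(innerXml);
        return xpath.evaluate("/order/status", inner);
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input: the inner document with its markup escaped once.
        String outerXml = "<envelope><payload>"
                + "&lt;order&gt;&lt;status&gt;shipped&lt;/status&gt;&lt;/order&gt;"
                + "</payload></envelope>";
        System.out.println(innerStatus(outerXml));
    }
}
```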
 