removing all element names from CDATA section

 
Vijay Chouhan
Ranch Hand
Posts: 30
Hi Ranchers,

I have the following XML element.



I need to get the text content of the CDATA section with all the tags removed, i.e., the output should be



I tried the following script.



However, this script converts all the angle brackets in the CDATA section to the corresponding entities and copies them to the output along with the tag names. The output from the above script looks like the following.



Is there any way I can obtain only the text content and remove all tags from the input?

I hope my question makes sense.

Thanks,
Vijay
[ January 17, 2007: Message edited by: Vijay Chouhan ]
 
Paul Clapham
Sheriff
Posts: 21107
Why? Using CDATA specifically says "This is not XML markup, this is just text. Do not treat it as markup." But you're saying you want to treat it as markup anyway?

Well, okay, you can do that. But you can't assume that it's going to be well-formed XML markup. It might have some < and > characters but you can't assume they will appear in pairs. But you could write some code that copied the text into a new location, but stopped copying it when you hit < and started again after you hit >.

But I'm still asking why you want to do that. I suspect there's some misunderstanding going on.
 
Vijay Chouhan
Ranch Hand
Posts: 30
Thanks for replying. You are absolutely right that this data should not have been enclosed in CDATA sections. However, the XML data is written by a different application, and it is hard to ask its developers to change their code. We do, however, have a guarantee that the contents are going to be well-formed markup.

"But you could write some code that copied the text into a new location, but stopped copying it when you hit < and started again after you hit >."


This is exactly what I need to do, but I have no idea how to accomplish it in XSLT. Is there an XSLT function that allows me to do a search and replace by specifying regular expressions?
 
Paul Clapham
Sheriff
Posts: 21107
You want to do this in XSLT? And all you want to do is remove the tags? Then you can write a template to do that. Here's pseudo-XSLT for the template, which would have a parameter containing the string to be de-tagged:

It's a recursive template; this is quite a common technique in declarative languages like XSLT.
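
For readers following along, here is a minimal sketch of what such a recursive template could look like in XSLT 1.0. It is not the code originally posted in this thread. The element name `description` and the template and parameter names `strip-tags` and `text` are hypothetical, chosen only for illustration. It also assumes that no `>` character appears inside an attribute value in the CDATA text.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point: "description" is an assumed name for the
       element whose CDATA section holds the tagged text. -->
  <xsl:template match="description">
    <xsl:call-template name="strip-tags">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursive template: copy everything before the next '<', skip
       ahead to just past the following '>', then recurse on the rest. -->
  <xsl:template name="strip-tags">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&lt;') and
                      contains(substring-after($text, '&lt;'), '&gt;')">
        <!-- Text in front of the tag is kept. -->
        <xsl:value-of select="substring-before($text, '&lt;')"/>
        <!-- Everything after the closing '>' is processed again. -->
        <xsl:call-template name="strip-tags">
          <xsl:with-param name="text"
              select="substring-after(substring-after($text, '&lt;'), '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No complete tag left: emit the remainder unchanged. -->
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

On the regular-expression question: XSLT 1.0 has no regex functions, which is why the recursion is needed. With an XSLT 2.0 processor, the XPath 2.0 `replace()` function should do the same job in one expression, for example `replace($text, '&lt;[^&gt;]*&gt;', '')`, under the same assumption about `>` in attribute values.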
 
Vijay Chouhan
Ranch Hand
Posts: 30
Thanks a ton for the suggestion. It solved my problem. :-)
 