How to handle nested html tags in XSL?

 
kapil Gupta
Ranch Hand
Posts: 89
I am trying to convert HTML that has nested tags to XML using XSLT.
My HTML looks like this:

The outer span contains an inner span. I can get the inner span using xsl:for-each inside the outer one, but I am unable to process the leftover text, i.e. "outer continues".

Is there any way to get the leftover text so that I can convert it into something like this:


Thanks,
Kapil
 
Paul Clapham
Marshal
Posts: 28177
If you're using normal XSLT methods, then text nodes are copied to the output by default. But you're using the procedural xsl:for-each instead of recursively processing the nodes of the XML tree, so you lose that feature.

If you want to convert span elements to richtext elements and keep the rest of the document unchanged, then start with an identity transformation and add the following template to convert span to richtext:

Then the xsl:apply-templates element will automatically copy all the text and attributes below the span element.
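
The template from the original reply is not preserved in this thread. A minimal sketch of the approach it describes, assuming the input is well-formed markup whose elements are in no namespace, might look like this:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific than the identity template, so it wins for span:
       emit richtext instead, then recurse so that attributes, text
       and nested spans below it are processed as well -->
  <xsl:template match="span">
    <richtext>
      <xsl:apply-templates select="@*|node()"/>
    </richtext>
  </xsl:template>

</xsl:stylesheet>
```

Because the span template recurses into its children, a span nested inside another span becomes a richtext nested inside another richtext. If the source is XHTML with a default namespace, the match patterns would need a namespace prefix.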
 
kapil Gupta
Ranch Hand
Posts: 89
Thanks for your reply, Paul.
I tried the code that you suggested, but it resulted in output of the form:

but I want to close the outer richtext when the inner one starts, and create a new outer richtext when the inner one closes. Basically, I don't want any hierarchy in the generated XML.

Thanks,
Kapil
[ February 11, 2007: Message edited by: kapil Gupta ]
 
Paul Clapham
Marshal
Posts: 28177
Okay. So you don't just want to convert span elements to richtext elements. Then you will have to express what you do want to do instead.

Requirements that talk about start tags and end tags are hard to implement in XSLT. You need a requirement that talks about elements and nodes. If you have a text node inside a span element, what do you want the result to look like?
 
kapil Gupta
Ranch Hand
Posts: 89
I am sorry for not stating my requirements clearly. I will try to make them clearer with an example. As I mentioned in my first post, I want to convert HTML to XML, and the HTML is of the form:

Now I want to convert it to XML of the form:

Basically, div is converted to a paragraph tag and span to richtext.
The only difference in the XML is that a richtext tag must not contain another richtext the way spans nest. As soon as a nested span starts, I want to close the outer richtext and start a new richtext for the inner span, and then open another richtext once the inner one closes.
Thanks,
Kapil
 
Paul Clapham
Marshal
Posts: 28177

"As soon as a nested span comes I want to close the outer richtext and start a new richtext for inner and then open a richtext after inner one closes."

As I said, requirements like this are extremely difficult to implement in XSLT. You will not get anywhere until you rephrase that in terms of elements and nodes.

Let's try this as requirements:
1. If a text node is a descendant of one or more span elements, it should be replaced in the output tree by a richtext element containing only that text node.
2. A span element should be replaced in the output tree by its text descendants with requirement 1 applied.

This translates into XSLT as:

Let's see if that works for a start.
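
The stylesheet posted at this point is not preserved in this thread. One way the two requirements could be written, as a sketch only, is shown below. The div-to-paragraph template comes from the asker's description rather than from the two requirements, and the element names are assumed to be in no namespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template for everything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- From the asker's description: div becomes paragraph -->
  <xsl:template match="div">
    <paragraph>
      <xsl:apply-templates select="@*|node()"/>
    </paragraph>
  </xsl:template>

  <!-- Requirement 2: a span emits no element of its own; it is replaced
       by all of its text descendants, taken in document order -->
  <xsl:template match="span">
    <xsl:apply-templates select=".//text()"/>
  </xsl:template>

  <!-- Requirement 1: a text node anywhere below a span is wrapped in
       a richtext element that contains only that text node -->
  <xsl:template match="text()[ancestor::span]">
    <richtext>
      <xsl:value-of select="."/>
    </richtext>
  </xsl:template>

</xsl:stylesheet>
```

With a hypothetical input of `<div><span>outer <span>inner</span> outer continues</span></div>`, this sketch produces `<paragraph><richtext>outer </richtext><richtext>inner</richtext><richtext> outer continues</richtext></paragraph>`, with no richtext nested inside another. Whitespace-only text nodes below a span also get their own richtext element, so indented input may need extra handling.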
 
kapil Gupta
Ranch Hand
Posts: 89
I was able to generate the XML in the required format after applying the transformation you suggested.
Thanks for helping me out Paul.
Kapil
 
kapil Gupta
Ranch Hand
Posts: 89
With the addition of new requirements, I have to handle some more HTML tags, such as bold, italic and underline. The HTML is of the form:

I would like to convert it to the form:

Basically, a richtext starts as soon as an HTML tag is found.
I am using the following XSL to transform it:


This code adds only one attribute to the richtext tag: it adds the italic attribute but not the bold attribute, even though bold is applied before italic in the same span.
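
The stylesheet in question is not preserved in this thread, so the following is not a correction of it, only a sketch of one possible approach in the same element-and-node style used earlier in the thread: decide the attributes at each text node by looking at all of its ancestors, instead of adding one attribute per formatting element. The tag names b, i and u and the attribute names bold, italic and underline are assumptions for illustration; the actual markup and target schema may differ.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template for everything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- div becomes paragraph, as described earlier in the thread -->
  <xsl:template match="div">
    <paragraph>
      <xsl:apply-templates select="@*|node()"/>
    </paragraph>
  </xsl:template>

  <!-- Formatting elements emit nothing themselves; only their children
       are processed (b, i, u are assumed tag names) -->
  <xsl:template match="span|b|i|u">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

  <!-- Each text node below a formatting element becomes one richtext.
       Every enclosing b, i or u contributes its own attribute, so
       nested formatting yields several attributes on one element.
       Attribute names here are hypothetical. -->
  <xsl:template match="text()[ancestor::span or ancestor::b
                              or ancestor::i or ancestor::u]">
    <richtext>
      <xsl:if test="ancestor::b">
        <xsl:attribute name="bold">true</xsl:attribute>
      </xsl:if>
      <xsl:if test="ancestor::i">
        <xsl:attribute name="italic">true</xsl:attribute>
      </xsl:if>
      <xsl:if test="ancestor::u">
        <xsl:attribute name="underline">true</xsl:attribute>
      </xsl:if>
      <xsl:value-of select="."/>
    </richtext>
  </xsl:template>

</xsl:stylesheet>
```

Under these assumptions, text inside `<b><i>…</i></b>` within a span ends up in a single richtext element carrying both the bold and the italic attribute, because both ancestor tests succeed for that text node.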
 