Turn XSLT into HTML content

Jeppe Sommer
Ranch Hand

Joined: Jan 07, 2004
Posts: 270
Hi.

I am using a servlet to transform an XML document into HTML using an XSLT stylesheet. It works fine and the HTML page is displayed correctly. So far so good.

My problem is that when I try to save the page locally as an HTML file from the browser (Save as...), the only option I get is to save it as XML.

When I right-click on the page and choose View Source, I see XML tags and not HTML tags.

When I try to convert the URL using PD4ML (a PDF converter), it writes the XML tags into the PDF instead of the rendered HTML.

What is wrong?

In the servlet I use:
response.setContentType("text/html; charset=utf-8");

In the XSL file I use the declaration:
<xsl:output method="html" indent="yes"/>

Any ideas what is wrong?

Please see code:
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

What is the actual content type of the page when you look at it in the browser?
Jeppe Sommer
Ranch Hand

Joined: Jan 07, 2004
Posts: 270
Paul Clapham wrote:What is the actual content type of the page when you look at it in the browser?


I see an HTML page. But how do I see the actual content type of the page? Please see link:
webpage
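
One way to check what the server actually sends, without relying on a browser add-on, is to request the URL and print the Content-Type response header. The following is only a minimal sketch: the class name ContentTypeCheck and its helper methods are made up for illustration, and the URL has to be supplied on the command line since the link is not reproduced here.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of the original thread.
public class ContentTypeCheck {

    /** Returns the raw Content-Type header the server sends for the URL, or null if absent. */
    static String contentTypeOf(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        try {
            return conn.getContentType();
        } finally {
            conn.disconnect();
        }
    }

    /** Strips parameters such as "; charset=utf-8" and lower-cases the media type. */
    static String mediaType(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semicolon = contentType.indexOf(';');
        String type = semicolon < 0 ? contentType : contentType.substring(0, semicolon);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ContentTypeCheck <url>");
            return;
        }
        String header = contentTypeOf(args[0]);
        System.out.println("Content-Type header: " + header);
        System.out.println("Media type only:     " + mediaType(header));
    }
}
```

This reports the header of the HTTP response itself. A browser that applies a stylesheet afterwards may report a different type for the transformed result.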
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Sorry, I was misled into believing that your servlet was doing the XSL transformation. But after looking at that page I see that you're making the browser do the transformation.

Anyway when I used HttpFox in Firefox to look at the HTTP conversation, I see that your XML data is being sent with a content type of "application/xml". But when I use "View Page Info" on the result of the transformation, Firefox tells me that the content type is "text/html; charset=UTF-8" (twice with different capitalization) and it allows me to save it as HTML.

However when I use IE then (as you say) it doesn't allow me to save it as HTML. Using the built-in developer tools to look at the generated HTML shows that the content type is there twice, once as "text/html; charset=UTF-16" and once as "text/html; charset=utf-8".

Perhaps you should change your code so that you don't tell the browser you're sending HTML when you are really sending XML? That might be the source of the confusion. The xsl:output element is the right place to specify that kind of information. And by the way when I just looked up xsl:output to make sure I know what I was talking about, I noticed an obscure "media-type" attribute whose default is supposedly "text/xml". Perhaps that might be relevant? I only just stumbled over it so I know nothing except what I just read.

(And by the way your stylesheets aren't linked right so the browser isn't downloading them; but perhaps you already knew that and wanted to get the bigger problem solved first.)
Jeppe Sommer
Ranch Hand

Joined: Jan 07, 2004
Posts: 270
Thanks for the good input.


I made a few changes to the code:

1) The servlet content type is: response.setContentType("text/xml; charset=iso-8859-1");

2) The meta tag has been removed from the XSL file, which now contains only this declaration:
<xsl:output method="html" encoding="iso-8859-1" indent="yes"/>

3) Where do you see the media-type attribute? In the stylesheet I import a CSS script:
<link rel="Stylesheet" type="text/css" href="ubl.css">

4) Why are the stylesheets linked the wrong way? Any tips? This is how I link the stylesheets in the servlet:

- and when looking at the source code it looks like:

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

(3) I saw the media-type attribute in on-line documentation for the xsl:output element. Just a minute... yes, it's in the XSLT book on my shelf too. Neither I nor the book know what it's for.

(4) It's the CSS scripts which the browsers can't find.
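
For what it's worth, the XSLT 1.0 recommendation describes media-type as the MIME content type of the serialized result. An XSLT processor does not send HTTP headers itself, so in a server-side transformation the calling code would have to read the value and set the header. A minimal sketch of reading it through the standard javax.xml.transform API follows; the class name MediaTypeProbe and the sample stylesheet are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original thread.
public class MediaTypeProbe {

    // Illustrative stylesheet that declares media-type on xsl:output.
    static final String SAMPLE_XSLT =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\" media-type=\"text/html\""
          + " encoding=\"iso-8859-1\" indent=\"yes\"/>"
          + "<xsl:template match=\"/\"><html><body/></html></xsl:template>"
          + "</xsl:stylesheet>";

    /** Compiles the stylesheet and returns the media-type output property in effect. */
    static String declaredMediaType(String xslt) throws TransformerConfigurationException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        return transformer.getOutputProperty(OutputKeys.MEDIA_TYPE);
    }

    public static void main(String[] args) throws TransformerConfigurationException {
        System.out.println(declaredMediaType(SAMPLE_XSLT));
    }
}
```

A servlet that runs the transformation itself could pass this value to response.setContentType(...). When the browser runs the transformation, the attribute may well have no visible effect.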
Jeppe Sommer
Ranch Hand

Joined: Jan 07, 2004
Posts: 270
Yes you are right.

I removed the css link.
I added the media-type attribute to the xsl:output declaration:
<xsl:output method="html" media-type="text/html" encoding="iso-8859-1" indent="yes"/>

But unfortunately I don't see any change, other than that the markup is now correct.

The page source in the browser is still XML. In IE9, "Save as..." only offers to save the page as ".xml" (Firefox does not have this problem).

The biggest problem is that when I use pd4ml (the PDF generator API), the PDF file still shows only XML tags and not the HTML page.

How can we change it so that the page source in the browser is HTML and not XML?
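
One possible approach, given that the browser is currently doing the transformation, is to run the XSLT on the server so that the response body is already HTML. Presumably a tool like pd4ml that fetches the URL would then receive HTML as well, though that is not verified here. The following is only a sketch using the standard javax.xml.transform API; the class name ServerSideTransform, the sample XML, and the sample stylesheet are invented for illustration and are not the poster's actual files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example, not the poster's servlet.
public class ServerSideTransform {

    // Illustrative input document.
    static final String SAMPLE_XML = "<invoice><id>42</id></invoice>";

    // Illustrative stylesheet using the html output method.
    static final String SAMPLE_XSLT =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\" encoding=\"utf-8\" indent=\"yes\"/>"
          + "<xsl:template match=\"/invoice\">"
          + "<html><body><h1>Invoice <xsl:value-of select=\"id\"/></h1></body></html>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML on the server and returns the serialized result. */
    static String toHtml(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Prints HTML markup; none of the source XML elements remain in the output.
        System.out.println(toHtml(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

In a servlet, the same transform call could write to new StreamResult(response.getWriter()) after response.setContentType("text/html; charset=utf-8"), so that the declared content type matches what is actually sent.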
 