How to convert HTML to PDF?

Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
Hello!


I want to convert, for example, http://www.google.com to PDF.
I found iText solutions, but I still haven't gotten them to work. Does anyone have experience with this? Any other suggestions?

All useful hints are welcome!


Regards
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42625
Do you mean for arbitrary HTML? That's tough. Any solution would probably start by changing the HTML into well-formed XML (using a library like TagSoup or NekoHTML), and then parsing that XML and creating the PDF as appropriate.

If it were CSS-styled XHTML, you could use https://xhtmlrenderer.dev.java.net/
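
To illustrate the second half of that pipeline (parsing the cleaned-up markup), here is a minimal sketch using only the JDK's built-in XML parser. It assumes the HTML has already been turned into well-formed XML by a tool such as TagSoup or NekoHTML, and it only builds a text outline where a real converter would emit PDF content. The class name `XhtmlOutline` and the method `outline` are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: walks well-formed XHTML and lists its headings and paragraphs.
class XhtmlOutline {

    /** Returns one line per h1 or p element, in document order. */
    static String outline(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), out);
            return out.toString();
        } catch (Exception e) {
            // Arbitrary HTML usually ends up here, which is why the clean-up step comes first.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Depth-first walk; a PDF writer would add content at the point where text is appended.
    private static void walk(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            if (name.equals("h1") || name.equals("p")) {
                out.append(name).append(": ").append(node.getTextContent().trim()).append('\n');
                return; // nested inline markup is already included in the text content
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1><p>Hello <b>PDF</b></p></body></html>";
        System.out.print(outline(page));
    }
}
```

Input that is not well-formed is rejected with an exception rather than repaired, so this step only works after the markup has been cleaned up.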


Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
Thank you for the reply, Ulf!


I mean conversion for generic HTML. I suppose that I'm not the first who needs this...
Any idea/solution?


Regards
Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
I found:
http://html-to-pdf-converter-free.software.informer.com/

How can I create an application like this?


Regards

Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8971

I used the Open Office Java API to open HTML documents and then export them as PDFs.


[How To Ask Questions On JavaRanch]
Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
Thank you Joe!


I'm new to Open Office.
Can you post more details about the setup and the Java code for PDF generation, please?
I posted this on the Open Office forum too.


Regards
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42625
Using OO in server mode is decidedly non-trivial - the API has a steep learning curve. But I believe that the JODConverter library makes the process significantly easier, so you may want to look into that first.
Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
I found JODConverter on:
http://sourceforge.net/projects/jodconverter/
I successfully started the Tomcat version.

It doesn't seem to support the .html extension, though. Is there any useful hint or code for reworking JODConverter so that it converts HTML pages to PDF? I'd pass in a link and expect the page back as a PDF document.

Regards
Darya Akbari
Ranch Hand

Joined: Aug 21, 2004
Posts: 1855
iText is definitely the wrong API for what you want. Have you heard of DocBook XML and DocBook XSL? Give them a try; they convert not only HTML but a lot more.


SCJP, SCJD, SCWCD, SCBCD
Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
Thank you Darya!


I haven't worked with DocBook XML and DocBook XSL yet.
I'm looking for examples.

JODConverter is working on Tomcat. That is exactly what I need.
If you have any hints that would speed up my digging, please post them.


Regards





Darya Akbari
Ranch Hand

Joined: Aug 21, 2004
Posts: 1855
DocBook XML and DocBook XSL are meant for technical writing, such as producing technical documents. When you have an HTML document, you can transform it into DocBook XML and from there into PDF.
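
The HTML-to-DocBook step described above is an XSLT transformation, and the mechanism can be sketched with the JDK's built-in XSLT processor. The stylesheet below is a toy written for this example, not the DocBook XSL distribution; it assumes well-formed input without an XML namespace and maps only `h1` and `p`. Getting from DocBook XML to PDF is a further step that is not shown. The names `HtmlToDocBookSketch` and `toDocBook` are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies a tiny XSLT stylesheet to well-formed, namespace-free HTML.
class HtmlToDocBookSketch {

    // Toy stylesheet (an assumption for illustration): h1 becomes title, p becomes para.
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/html'>"
        + "<article><xsl:apply-templates select='body/*'/></article>"
        + "</xsl:template>"
        + "<xsl:template match='h1'><title><xsl:value-of select='.'/></title></xsl:template>"
        + "<xsl:template match='p'><para><xsl:value-of select='.'/></para></xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms the given markup into DocBook-like XML using the toy stylesheet. */
    static String toDocBook(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1><p>Hello <b>PDF</b></p></body></html>";
        System.out.println(toDocBook(page));
    }
}
```

A real page would first need the same clean-up into well-formed XML discussed earlier in the thread, and a far more complete stylesheet.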
Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
How can I assemble a working example with DocBook XML and DocBook XSL?

In the meantime, I found the class:
org.w3c.tidy.Tidy
http://jtidy.sourceforge.net/apidocs/org/w3c/tidy/Tidy.html
Tidy converts HTML to XHTML. I set it up, but it doesn't work for all websites...
Does anyone have experience with this approach?
Is there any other way to create a generic HTML (URL) to PDF converter?


Regards


Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
http://www.pdfonfly.com
works for all the websites that I have tried so far.

How can I build this in Java?


Regards
Darya Akbari
Ranch Hand

Joined: Aug 21, 2004
Posts: 1855
Imre Tokai wrote: How can I assemble a working example with DocBook XML and DocBook XSL?


http://www.docbook.org/ has everything you need to know.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42625
Please BeForthrightWhenCrossPostingToOtherSites. It's the right thing to do.

http://forums.java.net/jive/thread.jspa?messageID=338033&tstart=0
Imre Tokai
Ranch Hand

Joined: Jun 04, 2008
Posts: 130
Another posting on another forum:

http://forums.sun.com/thread.jspa?threadID=5374819&tstart=0


Regards
 