I want to convert, for example, http://www.google.com to PDF.
I found some iText-based solutions, but I still haven't gotten them to work. Does anyone have experience with this? Any other suggestions?
All useful hints are welcome!
Regards
Ulf Dittmer
Marshal
Joined: Mar 22, 2005
Posts: 35243
Do you mean for arbitrary HTML? That's tough. Any solution would probably start by converting the HTML into well-formed XML (using a library like TagSoup or NekoHTML), then parsing that XML and creating the PDF from it as appropriate.
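To make the second half of that pipeline concrete, here is a minimal sketch using only the JDK. It assumes the page has already been cleaned into well-formed XML (that step is not shown), parses it into a DOM, and collects the headings and paragraphs that a PDF writer would then lay out. The class name `XhtmlBlocks` and the `tag: text` output format are made up for illustration; real pages need far more than `h1` to `h6` and `p`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: walks well-formed XHTML and lists its text blocks.
public class XhtmlBlocks {

    /** Returns the text of each heading and paragraph, in document order. */
    public static List<String> extractBlocks(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            List<String> blocks = new ArrayList<>();
            collect(doc.getDocumentElement(), blocks);
            return blocks;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Raw HTML that was not tidied first usually ends up here.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    private static void collect(Node node, List<String> blocks) {
        if (node instanceof Element) {
            String name = node.getNodeName().toLowerCase();
            if (name.equals("p") || name.matches("h[1-6]")) {
                blocks.add(name + ": " + node.getTextContent().trim());
                return; // the block's text is captured, no need to descend
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), blocks);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>";
        extractBlocks(page).forEach(System.out::println);
    }
}
```

Each collected block would then be handed to whatever PDF library is in use. The point of the sketch is only that once the markup is well-formed, the standard XML APIs can walk it.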
I'm new to OpenOffice.
Can you post more details about the setup and the Java code for PDF generation, please?
I posted this on the OpenOffice forum too.
Regards
Ulf Dittmer
Marshal
Joined: Mar 22, 2005
Posts: 35243
Using OpenOffice in server mode is decidedly non-trivial: the API has a steep learning curve. But I believe that the JODConverter library makes the process significantly easier, so you may want to look into that first.
Doesn't it support the .html extension? Is there any useful hint or code for reworking JODConverter so that it converts HTML pages to PDF? I'd like to pass in a link and get the page back as a PDF document.
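One possible way to go from a link to something a file-based converter can work with is to download the page to a local .html file first. Below is a minimal sketch of that download step using only the JDK. The class and method names (`PageFetcher`, `saveAsHtml`, `fetch`) are hypothetical, and the conversion call itself is deliberately left out, since it depends on the converter and version in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: stores a remote page as a local .html file.
public class PageFetcher {

    /** Copies the stream into a new temporary file whose name ends in ".html". */
    public static Path saveAsHtml(InputStream in) {
        try {
            Path target = Files.createTempFile("page-", ".html");
            // createTempFile already created the file, so allow overwriting it.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Downloads the page at the given address and stores it as a local .html file. */
    public static Path fetch(String address) {
        try (InputStream in = URI.create(address).toURL().openStream()) {
            return saveAsHtml(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path html = fetch(args.length > 0 ? args[0] : "http://www.google.com");
        System.out.println("Saved to " + html);
        // From here, html.toFile() can be handed to the converter of your choice.
    }
}
```

Note that this only saves the HTML itself. Images, stylesheets, and scripts referenced by the page are not downloaded, so the converted result may look different from the page in a browser.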
Regards
Darya Akbari
Ranch Hand
Joined: Aug 21, 2004
Posts: 1855
iText is definitely the wrong API for what you want. Have you heard of DocBook XML and DocBook XSL? Give them a try; they handle not only HTML conversion but a lot more.
SCJP, SCJD, SCWCD, SCBCD
Imre Tokai
Ranch Hand
Joined: Jun 04, 2008
Posts: 123
Thank you Darya!
I haven't worked with DocBook XML and DocBook XSL yet.
I'm looking for examples.
JODConverter is working on Tomcat. That is exactly what I need.
If you have any hints that would speed up my digging, please post them.
Regards
Darya Akbari
Ranch Hand
Joined: Aug 21, 2004
Posts: 1855
DocBook XML and DocBook XSL are meant for technical writing, such as producing technical documents. When you have an HTML document, you can transform it into DocBook XML and from there into PDF.
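As a rough illustration of the first step (HTML to DocBook-style XML), here is a small sketch that uses the XSLT support built into the JDK. The stylesheet is a toy written for this example, not the DocBook XSL distribution: it assumes well-formed, namespace-free XHTML and only maps `h1` to `title` and `p` to `para`. The class name `HtmlToDocBook` is likewise hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: turns a tiny subset of XHTML into DocBook-style XML.
public class HtmlToDocBook {

    // Toy stylesheet (an assumption for this sketch, not the official DocBook XSL):
    // wraps the body in <article>, maps <h1> to <title> and <p> to <para>,
    // and drops every other element.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/html'><article><xsl:apply-templates select='body/*'/></article></xsl:template>"
        + "<xsl:template match='h1'><title><xsl:value-of select='.'/></title></xsl:template>"
        + "<xsl:template match='p'><para><xsl:value-of select='.'/></para></xsl:template>"
        + "<xsl:template match='*'/>"
        + "</xsl:stylesheet>";

    /** Applies the toy stylesheet to well-formed XHTML and returns the result as a string. */
    public static String toDocBook(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>";
        System.out.println(toDocBook(page));
    }
}
```

Going from DocBook XML to PDF is a separate step. The usual route is to run the DocBook XSL stylesheets to produce XSL-FO and then feed that to an FO processor such as Apache FOP; that part is not shown here.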
Imre Tokai
Ranch Hand
Joined: Jun 04, 2008
Posts: 123
How can I put together a working example with DocBook XML and DocBook XSL?
In the meantime I found this class:
org.w3c.tidy.Tidy
http://jtidy.sourceforge.net/apidocs/org/w3c/tidy/Tidy.html Tidy cleans up HTML into XHTML. I set it up, but it doesn't work for all websites...
Any experience with this approach?
Is there any other way to build a generic HTML (URL) to PDF converter?