
Converting .pdf to .html file

Jack pero

Joined: Sep 09, 2005
Posts: 27
Is there any way to convert a .pdf file to an .html file in Java?
If so, could anyone please provide a code snippet for that? That would be a great help.

Thanks in advance,
Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42965
There are very few things that can be done with a PDF file after it has been created; extracting structured layout information is not one of them. Libraries like JPedal and PDFBox (linked on the AccessingFileFormats page) can extract the text contained in a PDF, but that's about as good as it gets.
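To illustrate what "text only" conversion amounts to, here is a minimal sketch. It assumes the plain text has already been pulled out of the PDF (for example with PDFBox's PDFTextStripper, which is not shown here) and simply wraps it in bare-bones HTML. The class and method names (PdfTextToHtml, toHtml, escapeHtml) are made up for illustration, and the blank-line-means-paragraph rule is an assumption about how the extracted text looks; fonts, columns, tables and images from the original PDF are not recovered.

```java
final class PdfTextToHtml {

    // Escape the characters that have special meaning in HTML text content.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Wrap extracted text in a minimal HTML page.
    // Assumption: blank lines in the extracted text separate paragraphs.
    static String toHtml(String title, String extractedText) {
        // Normalize line endings so the paragraph split only has to deal with '\n'.
        String text = extractedText.replace("\r\n", "\n").replace('\r', '\n');

        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>")
            .append(escapeHtml(title))
            .append("</title></head><body>\n");

        for (String paragraph : text.split("\n\\s*\n")) {
            String trimmed = paragraph.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Join hard-wrapped lines inside a paragraph with single spaces.
            String oneLine = trimmed.replaceAll("\\s*\n\\s*", " ");
            html.append("<p>").append(escapeHtml(oneLine)).append("</p>\n");
        }

        html.append("</body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text that a PDF library would have extracted.
        String sample = "First line of a paragraph\ncontinued here.\n\nSecond paragraph with a < b.";
        System.out.println(toHtml("Sample", sample));
    }
}
```

In a real program the sample string would be replaced by whatever the extraction library returns for the document.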
[ July 31, 2007: Message edited by: Ulf Dittmer ]
Mohd. Irfan Khan

Joined: Jul 18, 2007
Posts: 18
You can convert HTML to PDF as follows:
1) Convert the HTML to XHTML.
2) Convert the XHTML to XSL-FO (Extensible Stylesheet Language Formatting Objects) using an XSL stylesheet and a translator.
3) Pass the XSL-FO document to a formatter to generate the PDF.
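Step 2 can be sketched with the XSLT support that ships with the JDK (javax.xml.transform). This is only a rough illustration: the class name XhtmlToFo, the tiny inline stylesheet and the A4 page settings are assumptions made for the example, and the stylesheet handles nothing but plain paragraphs. Step 3 still needs a separate XSL-FO formatter, which is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XhtmlToFo {

    // Illustrative stylesheet: maps each XHTML <p> in <body> to an <fo:block>.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'"
        + " exclude-result-prefixes='h'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/h:html'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='29.7cm' page-width='21cm' margin='2cm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='h:body/h:p'/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='h:p'>"
        + "<fo:block space-after='6pt'><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Run the stylesheet over an XHTML string and return the XSL-FO markup.
    static String toFo(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrow unchecked to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'>"
                     + "<body><p>Hello</p><p>World</p></body></html>";
        System.out.println(toFo(xhtml));
    }
}
```

The input must already be well-formed XHTML in the XHTML namespace (the result of step 1), otherwise the templates will not match anything.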

Hopefully this is of some help to you!

Thanks,
Mohd. Irfan Khan
SCJP 1.4
Ernest Friedman-Hill
author and iconoclast

Joined: Jul 08, 2003
Posts: 24199

Not so much; he wants to convert PDF to HTML, not HTML to PDF.
