Converting .pdf to .html file

Jack pero

Joined: Sep 09, 2005
Posts: 27
Is there any way to convert a .pdf file to an .html file in Java?
If so, could anyone please provide a code snippet? That would be a great help.

Thanks in advance,
Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42959
There are very few things that can be done with a PDF file after it has been created; extracting structured layout information is not one of them. Libraries like JPedal and PDFBox (linked on the AccessingFileFormats page) can extract the text contained in a PDF, but that's about as good as it gets.
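As a rough illustration of what "text only" means in practice, here is a minimal sketch that wraps already-extracted text in bare-bones HTML. It assumes the text has been pulled out of the PDF beforehand, for example with PDFBox's PDFTextStripper; that extraction step is not shown. The class and method names (TextToHtml, escape, toHtml) are made up for this example, and treating blank lines as paragraph breaks is a guess about the extracted text, not something the PDF guarantees.

```java
// Hypothetical helper: wraps plain text (e.g. text extracted from a PDF) in minimal HTML.
class TextToHtml {

    // Escape the characters that have special meaning in HTML.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Assumption: blank lines separate paragraphs. Fonts, positions and images are not recovered.
    static String toHtml(String title, String text) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>").append(escape(title)).append("</title></head><body>\n");
        for (String para : text.split("\\R\\s*\\R")) {
            String trimmed = para.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>").append(escape(trimmed)).append("</p>\n");
            }
        }
        html.append("</body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text returned by a PDF text extractor.
        String extracted = "First paragraph, a < b.\n\nSecond paragraph.";
        System.out.print(toHtml("Converted document", extracted));
    }
}
```

The result is readable HTML, but none of the original page layout survives, which is the limitation described above.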
[ July 31, 2007: Message edited by: Ulf Dittmer ]
Mohd. Irfan Khan

Joined: Jul 18, 2007
Posts: 18
You can convert HTML to PDF as follows:
1) Convert the HTML to XHTML.
2) Convert the XHTML to XSL-FO (Extensible Stylesheet Language Formatting Objects) using an XSL stylesheet and an XSLT processor.
3) Pass the XSL-FO document to a formatter to generate the PDF.
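
Step 2 could look roughly like the sketch below, which uses the XSLT support built into the JDK (javax.xml.transform). The stylesheet is a toy written for this example: it only maps each XHTML <p> to an fo:block, so a real one would need rules for headings, lists, tables and so on. The class name XhtmlToFo and the method toFo are invented for illustration. Step 3 needs a separate XSL-FO formatter (Apache FOP is one example) and is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: turns a small XHTML document into XSL-FO with an inline stylesheet.
class XhtmlToFo {

    // Toy stylesheet (an assumption, not a complete XHTML mapping): each <p> becomes an fo:block.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'"
        + " exclude-result-prefixes='h'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='29.7cm' page-width='21cm' margin='2cm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='//h:p'/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='h:p'>"
        + "<fo:block space-after='6pt'><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the stylesheet to an XHTML string and return the XSL-FO as a string.
    static String toFo(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'>"
                     + "<body><p>Hello</p><p>World</p></body></html>";
        System.out.println(toFo(xhtml));
    }
}
```

The input has to be well-formed XHTML in the XHTML namespace, which is why step 1 comes first.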

Hopefully this will be of help to you!

Thanks,
Mohd. Irfan Khan
SCJP 1.4
Ernest Friedman-Hill
author and iconoclast

Joined: Jul 08, 2003
Posts: 24195

Not so much; he wants to convert PDF to HTML, not HTML to PDF.

[Jess in Action][AskingGoodQuestions]