
Converting .pdf to .html file

 
Jack pero
Greenhorn
Posts: 27
Hi,
Is there any way to convert a .pdf file to an .html file in Java?
If so, could anyone provide a code snippet for that? It would be a great help.

Thanks in advance,
Jack
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
There are very few things that can be done with a PDF file after it has been created; extracting structured layout information is not one of them. Libraries like JPedal and PDFBox (linked on the AccessingFileFormats page) can extract the text contained in a PDF, but that's about as good as it gets.
[ July 31, 2007: Message edited by: Ulf Dittmer ]
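
As a rough sketch of what "extract the text" leaves you with: once a library such as PDFBox or JPedal has produced a plain string (that extraction call is not shown here and depends on the library version), wrapping it in basic HTML needs only the standard library. The class and method names below are hypothetical, and treating blank lines as paragraph breaks is an assumption; none of the PDF's original layout is recovered.

```java
public class PdfTextToHtml {

    // Escape the characters that have special meaning in HTML.
    public static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap extracted text in a minimal HTML page.
    // Assumption: blank lines separate paragraphs; single line breaks become <br>.
    public static String toHtml(String title, String text) {
        // Normalize Windows and old Mac line endings to "\n".
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');

        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head><meta charset=\"UTF-8\"><title>")
            .append(escapeHtml(title))
            .append("</title></head>\n<body>\n");

        for (String paragraph : normalized.split("\n\\s*\n")) {
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>")
                    .append(escapeHtml(trimmed).replace("\n", "<br>\n"))
                    .append("</p>\n");
            }
        }

        html.append("</body>\n</html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // In practice this string would come from the PDF library's text extractor.
        String extracted = "First paragraph, line one\nline two\n\nSecond paragraph with a < sign";
        System.out.println(toHtml("Converted document", extracted));
    }
}
```

The result is readable HTML, but tables, columns, fonts and images from the PDF are lost, which is the limitation described above.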
 
Mohd. Irfan Khan
Greenhorn
Posts: 18
Hi,
You can convert HTML to PDF with the following steps:
1) Convert the HTML to XHTML.
2) Convert the XHTML to XSL-FO (XSL Formatting Objects) using an XSL style sheet and an XSLT transformer.
3) Pass the XSL-FO document to a formatter to generate the PDF.

Hopefully this is of some help to you!
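
For illustration, step 2 can be sketched with the JDK's built-in javax.xml.transform API. The class name, the tiny style sheet and the sample input are all made up for this example; the style sheet only maps XHTML paragraphs to fo:block elements, so a real one would need far more rules. Step 3 needs a separate XSL-FO formatter (Apache FOP is one such tool) and is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XhtmlToFo {

    // Hypothetical, deliberately tiny style sheet: every XHTML <p> becomes an <fo:block>.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\""
        + " xmlns:h=\"http://www.w3.org/1999/xhtml\""
        + " exclude-result-prefixes=\"h\">"
        + "<xsl:template match=\"/\">"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name=\"page\"><fo:region-body/></fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference=\"page\">"
        + "<fo:flow flow-name=\"xsl-region-body\">"
        + "<xsl:apply-templates select=\"//h:p\"/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match=\"h:p\">"
        + "<fo:block><xsl:value-of select=\".\"/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the style sheet to an XHTML string and return the XSL-FO document as a string.
    public static String toFo(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample input; it must be well-formed XHTML in the XHTML namespace.
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<body><p>Hello</p><p>World</p></body></html>";
        System.out.println(toFo(xhtml));
    }
}
```

Note that this pipeline runs from HTML to PDF only; it does not help with going from PDF back to HTML.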
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Pie
Posts: 24208
35
Chrome Eclipse IDE Mac OS X
Not so much; he wants to convert PDF to HTML, not HTML to PDF.
 