There are very few things that can be done with a PDF file after it has been created; extracting structured layout information is not one of them. Libraries like JPedal and PDFBox (linked on the AccessingFileFormats page) can extract the text contained in a PDF, but that's about as good as it gets. [ July 31, 2007: Message edited by: Ulf Dittmer ]
Hi,
You can convert HTML to PDF in three steps:
1) Convert the HTML to XHTML.
2) Convert the XHTML to XSL-FO (Extensible Stylesheet Language Formatting Objects) using an XSL stylesheet and an XSLT processor.
3) Pass the XSL-FO document to a formatter to generate the PDF.
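To make step 2 a bit more concrete, here is a rough sketch using only the XSLT support that ships with the JDK (javax.xml.transform). The class name XhtmlToFo and the tiny stylesheet are made up for illustration; the stylesheet only handles h1 and p and dumps everything else into plain blocks, so a real one would need far more templates. Step 1 (tidying HTML into XHTML) and step 3 (running a formatter such as Apache FOP over the result) are not shown here and need extra libraries.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for step 2 only: XHTML in, XSL-FO out.
class XhtmlToFo {

    // Illustrative stylesheet (an assumption, not a complete XHTML mapping).
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
      + " xmlns:h='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='h'>"
      + "<xsl:template match='/h:html'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='page'"
      + " page-height='29.7cm' page-width='21cm' margin='2cm'>"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='page'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<xsl:apply-templates select='h:body/*'/>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      // Headings become large bold blocks.
      + "<xsl:template match='h:h1'>"
      + "<fo:block font-size='18pt' font-weight='bold'>"
      + "<xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      // Paragraphs become blocks with a little space after them.
      + "<xsl:template match='h:p'>"
      + "<fo:block space-after='6pt'><xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      // Fallback: any other XHTML element is flattened to a plain text block.
      + "<xsl:template match='h:*'>"
      + "<fo:block><xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to a well-formed XHTML string and returns the XSL-FO.
    static String toFo(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml =
            "<html xmlns='http://www.w3.org/1999/xhtml'>"
          + "<head><title>Demo</title></head>"
          + "<body><h1>Title</h1><p>Hello</p></body></html>";
        // Prints the XSL-FO document that would be handed to the formatter in step 3.
        System.out.println(toFo(xhtml));
    }
}
```

Note that the input has to be in the XHTML namespace for the templates to match, which is one reason step 1 matters. The string returned by toFo is what you would pass to the formatter in step 3.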