PDF to HTML conversion in JSP using Java

Nazeer Ahammad
Ranch Hand

Joined: Feb 26, 2012
Posts: 43
Hi All,
I'm using the code below to convert a PDF file to HTML. It prints the table content as a plain string.
Example: suppose the PDF contains a table like this:
| Header |
| TD1 | TD2 | TD3 | TD4 |

If I use the JSP code below, I get output like this:

Header TD1 TD2 TD2 TD3 TD4

<%@page import="com.itextpdf.text.pdf.parser.PdfTextExtractor"%>
<%@page import="com.itextpdf.text.pdf.PdfReader"%>
<%@ page language="java" contentType="text/html; charset=ISO-8859-1"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>View page</title>
</head>
<body>
<%! String page1;%>
<%! String[] pagescon; %>
<%! String pages="Nazeer\nAhammad\nDudekula"; %>
<%
PdfReader reader = new PdfReader("D:/tablecontent.pdf");
System.out.println("This PDF has "+reader.getNumberOfPages()+" pages.");
// Extract the text of page 1 and replace every whitespace character with a space
page1=PdfTextExtractor.getTextFromPage(reader, 1).replaceAll("\\s"," ");
// The statement that fills pagescon is missing from the post; this split is an assumption
pagescon=page1.split("\n");
reader.close();
for(int i=0;i<pagescon.length;i++)
{
%>
<br> <%= pagescon[i]%>
<%} %>
</body>
</html>

Could anyone please suggest a solution?

Thank you,
William Brogden
Author and all-around good cowpoke

Joined: Mar 22, 2000
Posts: 12761
Seems to me that if you want extracted strings to be presented in an HTML table, you will have to write the HTML formatting yourself.

I would never try to do this with embedded code in a JSP. Instead I would create a class that could be tested outside the JSP/servlet environment. Once you get it producing well-formatted HTML, then see about using it in the JSP.
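
As a rough illustration of that approach, here is a minimal sketch of a standalone class that writes the HTML table markup itself. The class name HtmlTableFormatter and its methods are hypothetical (they are not part of iText), and the sketch assumes the cell strings have already been grouped into rows by some earlier step.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of iText: renders rows of cell text as an HTML table. */
final class HtmlTableFormatter {

    /** Escapes the characters that are significant in HTML text. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds one <tr> per row and one <td> per cell. */
    static String toHtmlTable(List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<table>\n");
        for (List<String> row : rows) {
            html.append("  <tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>\n");
        }
        return html.append("</table>").toString();
    }

    /** Runs outside any servlet container, so the markup can be checked on its own. */
    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Header"),
                Arrays.asList("TD1", "TD2", "TD3", "TD4"));
        System.out.println(toHtmlTable(rows));
    }
}
```

Because the class has no dependency on the JSP or servlet API, it can be run from the command line or a unit test first, and the JSP would then only need to print the returned string.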

Paul Clapham

Joined: Oct 14, 2005
Posts: 18541

And it seems to me that if you use a class named PdfTextExtractor, it's only going to extract the text from the PDF. If the PDF contains formatting such as tables, it isn't going to tell you anything about that.
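
Since the extracted text carries no table structure, any structure has to come from the caller. The sketch below shows one possible workaround under a strong assumption: the column count is known in advance and no cell contains whitespace. The class FlatTextRows is hypothetical, uses no iText calls, and is a heuristic rather than a way to recover the real table layout.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: regroups flat extracted text into rows of a known width. */
final class FlatTextRows {

    /**
     * Splits the text on whitespace and groups the tokens into rows of `columns` cells.
     * The column count is an assumption supplied by the caller; the text does not carry it.
     */
    static List<List<String>> group(String text, int columns) {
        if (columns < 1) {
            throw new IllegalArgumentException("columns must be at least 1");
        }
        List<List<String>> rows = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return rows;
        }
        List<String> current = new ArrayList<>();
        for (String token : trimmed.split("\\s+")) {
            current.add(token);
            if (current.size() == columns) {
                rows.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            rows.add(current); // last row may be shorter than the others
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(group("TD1 TD2 TD3 TD4 TD5 TD6 TD7 TD8", 4));
    }
}
```

A cell that spans several columns or contains a space would be split incorrectly here, which is exactly the information a plain text extractor does not provide.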