Convert PDF files to Tiff files

Anup Bansal
Ranch Hand

Joined: Sep 12, 2006
Posts: 69
Hi,

I need to convert multipage PDF files to single-page TIFF files.
Can I achieve this using the JAI library?
What other ways are there to do this?

Thanks & regards,
Anup
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41034
The http://pdfbox.apache.org/ library can create PNGs from PDFs. Then you can use the javax.imageio.ImageIO class to convert those to TIFFs (after TIFF-enabling ImageIO).
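
The second half of that pipeline (PNG to TIFF) can be sketched with plain ImageIO calls. This is only a minimal sketch: the class name `PngToTiff` is hypothetical, and it assumes a TIFF writer plugin is registered (JDK 9 and later ship one; older JDKs need the plugin installed separately). If no TIFF writer is found, `ImageIO.write` returns false rather than throwing.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox or ImageIO.
final class PngToTiff {

    /**
     * Reads any image format ImageIO can decode (PNG included) and
     * writes it back out as TIFF. Returns false if no TIFF writer
     * plugin is registered with ImageIO.
     */
    static boolean convert(File source, File target) throws IOException {
        BufferedImage image = ImageIO.read(source);
        if (image == null) {
            // ImageIO.read returns null when no reader recognises the file
            throw new IOException("No ImageIO reader could decode " + source);
        }
        return ImageIO.write(image, "tiff", target);
    }

    /** True if ImageIO currently has at least one TIFF writer registered. */
    static boolean tiffWriterAvailable() {
        return ImageIO.getImageWritersByFormatName("tiff").hasNext();
    }
}
```

Calling `PngToTiff.tiffWriterAvailable()` first is a cheap way to find out whether ImageIO has been TIFF-enabled in a given runtime.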


Anup Bansal
PDFBox seems to me more like a command-line utility than an API-based solution.
Is there an API-based solution for this?
Ulf Dittmer
The shell script is just a thin wrapper around the Java library. Look for the PDFToImage class in the PDFBox source code, and it should be clear how it works.
Anup Bansal
I get the following error when I use the PDFToImage class:
Throwable occurred: java.lang.NoClassDefFoundError: org.apache.fontbox.afm.FontMetric
    at org.apache.pdfbox.pdmodel.font.PDFont.getAFM(PDFont.java:313)
    at org.apache.pdfbox.pdmodel.font.PDFont.getFontWidthFromAFMFile(PDFont.java:262)
    at org.apache.pdfbox.pdmodel.font.PDSimpleFont.getFontWidth(PDSimpleFont.java:175)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:323)
    at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:106)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:698)
    at org.apache.pdfbox.util.PDFImageWriter.writeImage(PDFImageWriter.java:137)
    at com.abnarmo.nl.scan.pdfconvertor.split.PDFToImage.main(PDFToImage.java:204)
Caused by: java.lang.ClassNotFoundException: org.apache.fontbox.afm.FontMetric
    at java.net.URLClassLoader.findClass(URLClassLoader.java:419)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:643)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:300)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:609)
    ... 12 more

It seems that the class org.apache.fontbox.afm.FontMetric is not present. When I check fontbox-0.1.0.jar, I cannot find this class. Instead I find the class as org.fontbox.afm.FontMetric (without the "apache" in it).
I had downloaded this jar from http://pdfbox.apache.org/ . Where can I find the correct fontbox.jar?
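
A quick way to confirm which package name the jar on the classpath actually provides is to probe for both candidates at runtime. This is a minimal sketch: the helper class `ClasspathCheck` is hypothetical, and the two class names are simply the ones from the stack trace and the jar listing above.

```java
// Hypothetical diagnostic helper; run it with the same classpath as the failing program.
final class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "org.apache.fontbox.afm.FontMetric", // name PDFBox is asking for
            "org.fontbox.afm.FontMetric"         // name found in fontbox-0.1.0.jar
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "missing"));
        }
    }
}
```

If only the second name is reported as found, the fontbox jar and the PDFBox jar come from different releases.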

Thanks & regards,
Anup
Ulf Dittmer
Anup Bansal wrote: I had downloaded this jar from http://pdfbox.apache.org/ . Where can I find the correct fontbox.jar?

At the same place. You may want to grab jempbox as well, just in case it's needed.
Anup Bansal
Thanks, I got the correct version.
I would like to convert the PDF file to a TIFF file. So, as mentioned, the first step is to convert it into a PNG file and then, using JAI, convert the PNG into TIFF.

In PDFBox, the PDFImageWriter class is used to convert the PDF to the desired PNG file. However, the PNG file is created in the file system.
As I need to convert this further to a TIFF file, I would like to know whether it is possible to get a byte array of the PNG file, without it actually being created in the file system, which I can use as input for the ImageIO classes.
Ulf Dittmer
Anup Bansal wrote: ... and then, using JAI, convert the PNG into TIFF.

You'd be using the TIFF-enabled ImageIO to create the TIFF. As I said, JAI is not involved.

Anup Bansal wrote: In PDFBox, the PDFImageWriter class is used to convert the PDF to the desired PNG file. However, the PNG file is created in the file system.

Is having an interim file an actual problem? ImageIO can read a PNG file in one line of code - it doesn't get much easier than that.

Anup Bansal wrote: I would like to know whether it is possible to get a byte array of the PNG file, without it actually being created in the file system, which I can use as input for the ImageIO classes.

It is possible, but you'll have to dig deeper into PDFBox and ImageIO. The solution would involve adapting PDFImageWriter.writeImage to obtain a MemoryCacheImageOutputStream in the ImageIO.createImageOutputStream call instead of a FileImageOutputStream. Unless you've ascertained (how?) that creating interim PNG files is an actual problem, this is a lot of effort for little gain.
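
For illustration, the ImageIO side of that idea might look like the following. This is a rough sketch under assumptions, not a patch to PDFImageWriter: it assumes the rendered page is already available as a BufferedImage, and the class `InMemoryPng` and its method names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper showing PNG encoding and decoding without touching the file system.
final class InMemoryPng {

    /** Encodes the image as PNG into a byte array. */
    static byte[] toPngBytes(BufferedImage image) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // MemoryCacheImageOutputStream buffers in memory instead of in a file
        try (ImageOutputStream out = new MemoryCacheImageOutputStream(buffer)) {
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("No PNG writer registered with ImageIO");
            }
        } // closing the stream flushes any remaining bytes into 'buffer'
        return buffer.toByteArray();
    }

    /** Decodes image bytes (PNG or any other supported format) back into an image. */
    static BufferedImage fromBytes(byte[] data) throws IOException {
        return ImageIO.read(new ByteArrayInputStream(data));
    }
}
```

The byte array from `toPngBytes` can be handed straight to `fromBytes`, or to any other ImageIO call that accepts an InputStream, so no interim PNG file is needed.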
Anup Bansal
I am using RAD version 7.5. It does not contain the class com.sun.media.imageioimpl.plugins.tiff.TIFFImageWriterSpi.
Where can I find the jar file with the plugin for TIFF?
Ulf Dittmer
The link "TIFF-enabling ImageIO" I posted earlier explains that.
Anup Bansal
I am able to convert PNG format to TIFF; however, the TIFF file created is large.
How can I compress the TIFF file?

Following is a snippet of my code:
Ulf Dittmer
TIFFs will generally be larger than PNGs or JPEGs - they use less efficient compression. Is that an actual problem?
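
How large the TIFF ends up also depends on which compression the writer is asked to apply; ImageIO writers expose this through ImageWriteParam. The following is a minimal sketch, assuming a TIFF writer is registered and that it offers a compression type named "LZW" (the JDK's bundled TIFF plugin does; other plugins may use different names, so the code checks the list first). The class `CompressedTiff` is hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper; not taken from the code discussed in this thread.
final class CompressedTiff {

    /**
     * Encodes the image as TIFF using the named compression type.
     * Returns null if ImageIO has no TIFF writer registered.
     */
    static byte[] encode(BufferedImage image, String compressionType) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            return null;
        }
        ImageWriter writer = writers.next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        String[] types = param.canWriteCompressed() ? param.getCompressionTypes() : null;
        if (types == null || !Arrays.asList(types).contains(compressionType)) {
            writer.dispose();
            throw new IOException("TIFF writer does not offer " + compressionType);
        }
        // A compression type can only be set once the mode is MODE_EXPLICIT
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionType(compressionType);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ImageOutputStream out = new MemoryCacheImageOutputStream(buffer)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        return buffer.toByteArray();
    }
}
```

Printing `param.getCompressionTypes()` for the installed writer shows which names are actually available in a given runtime.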

BTW, you shouldn't use PNGImageReaderSpi directly. This will do nicely (and work for all supported image formats, not just PNG):

Anup Bansal
OK, thanks for your input!
I can use the approach described in this thread to convert PDF to PNG and then to TIFF.
I want to convert a multipage PDF to TIFF.
What would be the best approach: to split the final TIFF, or to split the PDF first and then create the individual TIFFs?
Is there a better approach?
Also, at the following link I found a different approach (using JPedal).
Are there any differences or downsides to using it?
Link using Jpedal
 
 