Page Count in PDF document

 
Greenhorn
Posts: 20
I have a requirement to loop through a directory containing PDF files and find the number of pages in each PDF. Any thoughts?

I know we can use the classes in the iText jar file, but I wanted to know if anyone has a better idea.

Thanks in advance
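
The directory loop itself needs nothing beyond the standard library. Below is a minimal sketch, assuming Java 8 or later; `PdfPageCounts`, `PageCounter`, and `countAll` are hypothetical names invented for illustration, and the actual page counting is left to whichever PDF library gets plugged in (iText, PDFBox, or another).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

public class PdfPageCounts {

    /** Hypothetical hook: delegate to a PDF library of your choice here. */
    @FunctionalInterface
    public interface PageCounter {
        int countPages(Path pdf) throws IOException;
    }

    /** Returns file name -> page count for every *.pdf directly inside dir. */
    public static Map<String, Integer> countAll(Path dir, PageCounter counter) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        // The glob only filters by name; subdirectories are not visited.
        try (DirectoryStream<Path> pdfs = Files.newDirectoryStream(dir, "*.pdf")) {
            for (Path pdf : pdfs) {
                if (Files.isRegularFile(pdf)) {
                    counts.put(pdf.getFileName().toString(), counter.countPages(pdf));
                }
            }
        }
        return counts;
    }
}
```

Keeping the counter behind an interface means the loop does not change if the PDF library is swapped later. Note that glob matching is case-sensitive on some file systems, so files named `*.PDF` may need a second pattern.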
 
Rancher
Posts: 43028
As you said, iText can give you the result you're looking for. What would a "better" result look like?
 
P Igor
Greenhorn
Posts: 20
Is iText the only option?

The "better" way would be to use the existing Java SE API to do what iText does.
 
Marshal
Posts: 27371
But iText does use the existing Java API to do that. You could do the same thing if you wanted to spend the time and write an implementation of the PDF spec, but that's a non-trivial exercise. Even implementing just the part of the PDF spec that you need is non-trivial. And iText has the advantage that it has already been tested by a lot of people on a lot of different PDF files. It's a tremendous value considering that you pay absolutely nothing for it. That's a lot of "betters" in my opinion.
 
P Igor
Greenhorn
Posts: 20
Thank you, Paul and Ulf.
Is iText free, or do we need to purchase a license?
 
P Igor
Greenhorn
Posts: 20
And what about PDFBox? Has anyone used it? How does it compare to iText?
 
P Igor
Greenhorn
Posts: 20
Here is some simple code to find the number of pages in a PDF, but I am getting an exception. Has anyone come across this? How do I resolve it?



Exception:
java.io.IOException: PMBOK.pdf not found as file or resource.
at com.lowagie.text.pdf.RandomAccessFileOrArray.<init>(Unknown Source)
at com.lowagie.text.pdf.RandomAccessFileOrArray.<init>(Unknown Source)
at com.lowagie.text.pdf.PRTokeniser.<init>(Unknown Source)
at com.lowagie.text.pdf.PdfReader.<init>(Unknown Source)
at com.lowagie.text.pdf.PdfReader.<init>(Unknown Source)
at test.CreateIndexFiles.main(CreateIndexFiles.java:37)

I have Acrobat 8 installed on my machine.

Thank you.
 
Marshal
Posts: 76452
Go through the usual procedure for any Exception: find where it occurred, using the line number (37) and see what you are doing there.
Find out what is going wrong; where is the file you are supposed to be looking for? If the file isn't there, or if you are trying to read a write-only file, or if another application has opened it, so the file is unavailable, you will get an Exception.

Try sorting the file out and see what happens.
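
One way to follow that advice is to check the path before handing it to the PDF library. The sketch below uses only `java.nio.file`; `PdfPathCheck` and `diagnose` are hypothetical names, and the default file name is taken from the stack trace above. It assumes the problem is a relative file name being resolved against a different working directory than expected, which is a common cause but not the only possible one.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PdfPathCheck {

    /** Reports the first obvious problem with the path, or "ok" if none is found. */
    public static String diagnose(String fileName) {
        Path path = Paths.get(fileName).toAbsolutePath();
        if (!Files.exists(path)) {
            return "missing: " + path;
        }
        if (!Files.isRegularFile(path)) {
            return "not a regular file: " + path;
        }
        if (!Files.isReadable(path)) {
            return "not readable: " + path;
        }
        return "ok: " + path;
    }

    public static void main(String[] args) {
        // A relative name such as "PMBOK.pdf" resolves against the working directory.
        System.out.println("working directory: " + Paths.get("").toAbsolutePath());
        System.out.println(diagnose(args.length > 0 ? args[0] : "PMBOK.pdf"));
    }
}
```

Printing the absolute path usually makes it obvious whether the program is looking in the directory you think it is.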
 
P Igor
Greenhorn
Posts: 20
Thank you.

It is working now after I used the following line of code:



Please note that this won't work if the file is encrypted.
 
Campbell Ritchie
Marshal
Posts: 76452
Well done getting it to work. Obviously you would have to decrypt any files before analysing them.
 
Greenhorn
Posts: 26
Hi all,

While running the above program I am getting the following error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Note: my PDF file is larger than 1.5 GB. With smaller PDF files there is no problem.

Can anyone please help me resolve this problem?
 
Sheriff
Posts: 22684
You can try to use the -Xmx flag of the JVM; execute "java -X" to see more information. If that still will not allow you to open the file (which is quite understandable, the file is just so freakingly large) then you will need to find a different PDF library that does not store everything in memory but uses a more stream-like way of handling. Compare this to XML parsing using DOM (the entire document is stored in memory) versus SAX (only a small part is in memory at any given time, and events are fired for each part).
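
To see whether a larger heap has any chance of helping, you can compare the file size with the maximum heap the JVM was started with. This is only a rough sketch: `HeapCheck` and `mayFitInHeap` are hypothetical names, and the rule "the heap must be at least as large as the file" is an assumption for illustration. A library that loads the whole document will typically need noticeably more than the raw file size.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeapCheck {

    /** Rough lower bound only: a file read whole needs at least its own size in heap. */
    public static boolean mayFitInHeap(long fileBytes, long maxHeapBytes) {
        return fileBytes < maxHeapBytes;
    }

    public static void main(String[] args) throws IOException {
        Path pdf = Paths.get(args[0]);
        long maxHeap = Runtime.getRuntime().maxMemory(); // reflects -Xmx when it is set
        long size = Files.size(pdf);
        System.out.printf("file: %d MB, max heap: %d MB%n", size >> 20, maxHeap >> 20);
        if (!mayFitInHeap(size, maxHeap)) {
            System.out.println("Heap is smaller than the file; try a larger -Xmx value.");
        }
    }
}
```

For example, `java -Xmx4g HeapCheck big.pdf` runs the check with a 4 GB maximum heap; `big.pdf` is a placeholder file name.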

But maybe you should ask yourself why a PDF file needs to be 1.5GB. I have very few files on my system that are over 1GB. Most are CD/DVD images that I've ripped for backups / faster installation or movies. Surely nothing like a Word or PDF file; the largest PDF file I have is a 580 page installation guide that is just 5.5MB. Not even remotely as large as your PDF file.
 