PDF Text Content extraction using iText5.0.5

Divya Kambhatla

Joined: Jan 25, 2011
Posts: 13

I want to extract the text from a PDF using iText 5.0.5. The problem is that when I extract text, everything gets extracted, including page numbers, figure titles, and page titles. I am completely new to the iText API. Could anyone please tell me whether there is a method or interface in iText that extracts ONLY the body text, or at least how I could identify whether the page numbers, page titles, and figure titles are included in the page text?

Thanks in advance!
Paul Clapham

Joined: Oct 14, 2005
Posts: 19858

Please read this: CarefullyChooseOneForum. Your duplicate post is in a suitable forum so I have locked this one.
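
For the question in the original post, one possible approach is to post-process the extracted text rather than rely on the extractor to skip these elements. The following is only a minimal sketch in plain Java, not an iText feature. It assumes the text of each page has already been extracted into one string per page (for example with iText's PdfTextExtractor). The class name PageTextCleaner and both patterns are hypothetical and would need tuning for a real document. It drops lines that look like page numbers, lines that look like figure titles, and lines that repeat on every page (likely running page titles).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper: heuristically strips page furniture from per-page text.
final class PageTextCleaner {
    // Assumed page-number shapes: "12", "Page 12", "Page 12 of 40".
    private static final Pattern PAGE_NUMBER =
            Pattern.compile("(?i)^(page\\s+)?\\d+(\\s+of\\s+\\d+)?$");
    // Assumed figure-title shapes: "Figure 3: ..." or "Fig. 3 ...".
    private static final Pattern FIGURE_TITLE =
            Pattern.compile("(?i)^fig(ure|\\.)\\s*\\d+.*$");

    // pages: one string per page, lines separated by line breaks.
    static List<String> clean(List<String> pages) {
        // Count on how many pages each distinct line occurs.
        Map<String, Integer> pageCount = new HashMap<>();
        for (String page : pages) {
            Set<String> seenOnPage = new HashSet<>();
            for (String line : page.split("\\r?\\n")) {
                String t = line.trim();
                if (!t.isEmpty() && seenOnPage.add(t)) {
                    pageCount.merge(t, 1, Integer::sum);
                }
            }
        }

        List<String> body = new ArrayList<>();
        for (String page : pages) {
            for (String line : page.split("\\r?\\n")) {
                String t = line.trim();
                if (t.isEmpty()) continue;
                if (PAGE_NUMBER.matcher(t).matches()) continue;
                if (FIGURE_TITLE.matcher(t).matches()) continue;
                // A line present on every page of a multi-page document
                // is treated as a running page title.
                if (pages.size() > 1 && pageCount.get(t) == pages.size()) continue;
                body.add(t);
            }
        }
        return body;
    }
}
```

A heuristic like this can misfire: a body sentence that begins with "Figure 3 shows" on its own line would be dropped, and a page title that varies by chapter would be kept. Filtering by position on the page is usually more reliable when the layout is consistent.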