Replace complete pdf text with any text preserving styling, format & images.

krishnann ravi
Greenhorn

Joined: Apr 21, 2012
Posts: 1
Hello people!

In my project, I have to identify all the text contained in a PDF and replace it with other text. Ideally the replacement text would be meaningful, but since that seems too tedious, I'm thinking of cutting the project down to just *any* text.

The format & styling (including images) of the output PDF should be preserved, and the text should not overflow onto the images. I have considered the PDF manipulation libraries iText and Apache PDFBox so far.

In Apache PDFBox, there's a program called "ReplaceString", but it needs a specific "string to replace" and a specific "replacement string". The problem is that I need to replace all the words in the PDF with *any* text, so a single string replacement doesn't serve the purpose.

Here is the approach I have thought of:

Something that reads every word, counts the number of characters in it, and replaces it on the spot with *any* word of the same character count. Maybe we can use a test condition for character counts from 1 to 15.
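
To make the idea concrete, here is a minimal sketch of the same-length replacement step on its own, working on a plain Java String. It is only an illustration under some assumptions: the class and method names (SameLengthReplacer, replaceWords) are made up for this post, it generates random letters instead of looking up real words by length (so no 1 to 15 condition is needed), and it does not show how to pull the text out of the PDF or write it back with iText or PDFBox.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of iText or PDFBox.
final class SameLengthReplacer {

    // Replaces each ASCII letter with a random letter of the same case and
    // each ASCII digit with a random digit. Whitespace, punctuation, and
    // non-ASCII characters are copied unchanged, so every word keeps its
    // original character count and position in the string.
    static String replaceWords(String text, Random rng) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'a' && c <= 'z') {
                out.append((char) ('a' + rng.nextInt(26)));
            } else if (c >= 'A' && c <= 'Z') {
                out.append((char) ('A' + rng.nextInt(26)));
            } else if (c >= '0' && c <= '9') {
                out.append((char) ('0' + rng.nextInt(10)));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original = "Invoice 42: total due, see page 3.";
        // Output differs on every run because the Random is unseeded.
        System.out.println(replaceWords(original, new Random()));
    }
}
```

Keeping the character count the same does not guarantee the same rendered width: in a proportional font, "iiii" is narrower than "mmmm", so some lines may still come out slightly longer or shorter than the originals.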

My deadline is approaching, and I have not made much progress because I got off track.

It would be great if someone could guide me on how to approach this, and tell me whether similar work has been done in the past that I could use and build on.

Thanks very much!

Regards,

Ravi Krishnan
 