Hi! I'm having some problems encoding (or maybe decoding) images from a PDF file. I have learned that PDF files store images in streams, so my idea is to open the PDF file and extract the data between 'stream' and 'endstream' for each image. I then paste those 'strings' into Notepad and save the result as filename.tiff. However, what I expect to be an image cannot be read by any image viewer. May I ask for some help with this? It seems I am either not getting a valid image file, or I am getting a deflated/encoded image file. Thanks a lot.
1. You should be using a Java library for reading PDF documents. I don't know if iText would be sufficient, but it is a popular choice for *writing* PDF documents.
2. Binary != text. You can't paste binary data into Notepad and expect to save it without the data becoming corrupted.
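To illustrate point 2, here is a minimal sketch of pulling the bytes between 'stream' and 'endstream' out of the file without ever treating them as text. The class and method names (PdfStreamBytes, indexOf, firstStream) are made up for this example. It assumes the first occurrence of the word 'stream' in the file really is a stream keyword, and it ignores the /Length entry that a real PDF parser would use, so treat it as a demonstration rather than a replacement for a proper library.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of any PDF library.
class PdfStreamBytes {

    // Index of the first occurrence of pattern in data at or after 'from', or -1.
    static int indexOf(byte[] data, byte[] pattern, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Raw bytes of the first stream in the file, or null if none is found.
    // Assumption: a naive keyword search is good enough for a demo. Any
    // end-of-line marker written just before 'endstream' is returned as
    // part of the data.
    static byte[] firstStream(byte[] pdf) {
        byte[] startKey = "stream".getBytes(StandardCharsets.US_ASCII);
        byte[] endKey = "endstream".getBytes(StandardCharsets.US_ASCII);

        int start = indexOf(pdf, startKey, 0);
        if (start < 0) {
            return null;
        }
        start += startKey.length;
        // Skip the line break (CRLF or LF) that follows the keyword.
        if (start < pdf.length && pdf[start] == '\r') {
            start++;
        }
        if (start < pdf.length && pdf[start] == '\n') {
            start++;
        }

        int end = indexOf(pdf, endKey, start);
        if (end < 0) {
            return null;
        }
        return Arrays.copyOfRange(pdf, start, end);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: PdfStreamBytes input.pdf output.bin");
            return;
        }
        // Read and write bytes only: no String, Reader, or Writer involved.
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        byte[] data = firstStream(pdf);
        if (data == null) {
            System.err.println("no stream found");
            return;
        }
        Files.write(Paths.get(args[1]), data);
    }
}
```

The important part is that the data goes from byte[] to byte[]. As soon as it passes through a String or a text editor, bytes that are not valid in the chosen character encoding get replaced and cannot be recovered.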
Thanks a lot for the response, Jeff! I went through iText's list of features, but I can't seem to find PDF parsing or image extraction among them. It's nice to have the idea that text != binary confirmed; now I think my real problem is how to convert the text back to binary, or in other words, how to encode the text into an image.
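A note on converting the text back: in general this is not possible, because the bytes that were replaced when the data was treated as text are gone. The usual way out is to go back to the original PDF and read the stream as bytes. If the stream's dictionary lists /FlateDecode as its filter, those bytes are zlib-compressed and can be inflated with java.util.zip.Inflater. Below is a rough sketch of that step. The class FlateSketch and its methods are names invented for this example, and the deflate helper exists only so the demo has something to inflate.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of any PDF library.
class FlateSketch {

    // Inflates zlib-wrapped data, which is what a /FlateDecode stream holds.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported zlib data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Demo-only helper: compresses data the same way so inflate can be tried out.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "sample image bytes".getBytes(StandardCharsets.US_ASCII);
        byte[] packed = deflate(raw);
        byte[] unpacked = inflate(packed);
        System.out.println("compressed: " + packed.length + " bytes, inflated: " + unpacked.length + " bytes");
    }
}
```

Even after inflating, the result is usually raw pixel samples rather than a complete image file, so saving it as .tiff will not produce something a viewer can open. Streams that use /DCTDecode are the exception: their bytes are JPEG data and can normally be written to a .jpg file as they are.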