Hi! I'm trying to write a program to convert a batch of images in PDF files (1 picture per file) to JPEG. Each picture is wrapped in a "header" and a "footer", and if I just remove them and rename the file to .jpg I get exactly what I want.
The problem starts with removing these headers and footers.
In my PDF files I have:
...bunch of PDF bytes...stream....bunch of JPG bytes....endstream...more PDF bytes...
So my idea was this:
1. take the first 1000 bytes, convert them to a String and find the index of the keyword "stream"
2. take the last 1000 bytes, convert them to a String and find the index of the keyword "endstream"
This way I get the beginning and the end of the JPEG data, so I can copy it into my JPG file.
The "off" parameter to in.read is not an offset into the file, it is an offset into the "b" array. You can't read at an arbitrary offset into the file in this way. You need to call the skip method to advance to the point in the file where you want to read, something like this:
This may well not work - be sure to read the javadocs for the skip method.
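For illustration, here is a minimal sketch of the skip-then-read idea. It is not the code from the original reply; the class name SkipRead and the method readAt are hypothetical. It assumes the file is opened with a plain FileInputStream, and it loops on both skip() and read() because either call may process fewer bytes than requested.

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class SkipRead {
    /** Reads exactly len bytes starting at file position pos (hypothetical helper). */
    static byte[] readAt(String path, long pos, int len) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            // skip() may skip fewer bytes than asked for, so repeat until done.
            long remaining = pos;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    // Nothing skipped: consume one byte to tell a stall from end of file.
                    if (in.read() < 0) {
                        throw new EOFException("file is shorter than " + pos + " bytes");
                    }
                    skipped = 1;
                }
                remaining -= skipped;
            }

            byte[] b = new byte[len];
            int filled = 0;
            while (filled < len) {
                // The second argument is an offset into b, not into the file.
                int n = in.read(b, filled, len - filled);
                if (n < 0) {
                    throw new EOFException("hit end of file after " + filled + " bytes");
                }
                filled += n;
            }
            return b;
        }
    }
}
```

Calling SkipRead.readAt(path, 3, 4) should return the four bytes at file positions 3 to 6.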
Also, "String footer = first1000.toString()" won't generally work. File contents are binary and can't be converted to a string like this. You need to search the byte array for the sequence of values that make up ".stream." and ".endstream."
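As a rough sketch of searching the bytes directly, something along these lines could work. The class ByteSearch and its methods are hypothetical names, and the keywords are assumed to be plain ASCII. Note that "endstream" itself contains "stream", so the forward search should run on the start of the file and the backward search on the end.

```java
import java.nio.charset.StandardCharsets;

class ByteSearch {
    /** Returns the index of the first occurrence of pattern in data at or after from, or -1. */
    static int indexOf(byte[] data, byte[] pattern, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return i;
        }
        return -1;
    }

    /** Returns the index of the last occurrence of pattern in data, or -1. */
    static int lastIndexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = data.length - pattern.length; i >= 0; i--) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    /** Converts an ASCII keyword such as "stream" to the bytes to search for. */
    static byte[] ascii(String keyword) {
        return keyword.getBytes(StandardCharsets.US_ASCII);
    }
}
```

The image data would then start somewhere after indexOf(head, ascii("stream"), 0) plus the keyword length and the line break that follows it, and end before lastIndexOf(tail, ascii("endstream")). The exact number of line-break bytes to step over is an assumption to check against the actual files.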
Thank you very much Ulf! You saved me a lot of time.
I'm done with the program and it works just like I wanted it.
[Ulf]: This may well not work - be sure to read the javadocs for the skip method.
Not to mention the docs for read(). Both read() and skip() have return values that should not be ignored, and both should be put inside loops to ensure that they have their complete desired effect. Or you can use a RandomAccessFile and use seek() and readFully() instead of skip() and read().
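A minimal sketch of the RandomAccessFile alternative might look like the following. The class RandomAccessRead and its methods are hypothetical names, and it assumes the range fits in an int-sized array. seek() moves straight to a file position, and readFully() either fills the whole array or throws an EOFException, so no loops are needed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class RandomAccessRead {
    /** Reads the bytes from file position start (inclusive) to end (exclusive). */
    static byte[] readRange(String path, long start, long end) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] b = new byte[(int) (end - start)];
            raf.seek(start);   // absolute position in the file
            raf.readFully(b);  // fills b completely or throws EOFException
            return b;
        }
    }

    /** Reads up to maxLen bytes from the end of the file, e.g. the last 1000 bytes. */
    static byte[] readTail(String path, int maxLen) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            int len = (int) Math.min(maxLen, raf.length());
            byte[] b = new byte[len];
            raf.seek(raf.length() - len);
            raf.readFully(b);
            return b;
        }
    }
}
```

For example, readTail(path, 1000) returns the last 1000 bytes of the file, or the whole file if it is shorter than that.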