
Convert PDF to JPG, byte streams

 
Ivan Bosnjak
Greenhorn
Posts: 15
Hi! I'm trying to write a program to convert a batch of images stored in PDF files (one picture per file) to JPEG. Each picture is wrapped in a "header" and a "footer", and if I just remove them and change the file extension to JPG I get exactly what I want.

The problem starts with removing these headers and footers.

In my PDF files I have:

...bunch of PDF bytes...stream....bunch of JPG bytes....endstream...more PDF bytes...

So my idea was this:
1. take the first 1000 bytes, convert them to String and find the index of the keyword "stream"

2. take the last 1000 bytes, convert them to String and find the index of the keyword "endstream"

This way I get the beginning and the end of the JPG data, so I can copy the bytes in between into my JPG file.

This is some code I started with:



I keep getting an IndexOutOfBoundsException on this line:



and I can't figure out why?

I would also like to know if you think my way of thinking is OK

Cheers
 
Ulf Dittmer
Rancher
Pie
Posts: 42966
73
Welcome to JavaRanch.

The "off" parameter to in.read is not an offset into the file, it is an offset into the "b" array. You can't read at an arbitrary offset into the file in this way. You need to call the skip method to advance to the point in the file where you want to read, something like this:



This may well not work - be sure to read the javadocs for the skip method.
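A minimal sketch of the skip-then-read idea, assuming a plain InputStream. The class and method names (SkipThenRead, skipFully, readFully, readAt) are hypothetical and not from the original post. Both skip() and read() may process fewer bytes than requested, so each call sits in a loop, and the "off" argument is only ever used as an index into the array.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, for illustration only.
final class SkipThenRead {

    // Skips exactly n bytes. skip() may skip fewer bytes than asked for
    // (even zero), so keep calling it until the count is used up.
    public static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped > 0) {
                n -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and a single read() hit end of stream
                throw new EOFException("stream ended with " + n + " bytes left to skip");
            } else {
                n--; // the single read() consumed one byte
            }
        }
    }

    // Fills b completely. Here 'off' is a position inside b, not inside the file.
    public static void readFully(InputStream in, byte[] b) throws IOException {
        int off = 0;
        while (off < b.length) {
            int count = in.read(b, off, b.length - off);
            if (count == -1) {
                throw new EOFException("stream ended after " + off + " bytes");
            }
            off += count;
        }
    }

    // Reads 'length' bytes starting 'position' bytes after the stream's current position.
    public static byte[] readAt(InputStream in, long position, int length) throws IOException {
        skipFully(in, position);
        byte[] b = new byte[length];
        readFully(in, b);
        return b;
    }
}
```

To get the last 1000 bytes of a file this way, something like readAt(new FileInputStream(file), file.length() - 1000, 1000) should do, provided the file is at least 1000 bytes long.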

Also, "String footer = first1000.toString()" won't generally work. File contents are binary and can't be converted to a string like this. You need to search the byte array for the sequence of byte values that make up "stream" and "endstream".
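A rough sketch of such a byte-level search, under some assumptions: the whole file is already in a byte array, it contains exactly one stream, and that stream holds raw JPEG data. The names StreamMarkers, indexOf, lastIndexOf and extractPayload are made up for this example. The end-of-line handling after the "stream" keyword (CRLF or LF) reflects how PDF files are normally laid out; check it against your own files.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class StreamMarkers {

    static final byte[] STREAM = "stream".getBytes(StandardCharsets.US_ASCII);
    static final byte[] ENDSTREAM = "endstream".getBytes(StandardCharsets.US_ASCII);

    // Index of the first occurrence of pattern at or after 'from', or -1 if absent.
    public static int indexOf(byte[] data, byte[] pattern, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Index of the last occurrence of pattern, or -1 if absent.
    public static int lastIndexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = data.length - pattern.length; i >= 0; i--) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Returns the bytes between the first "stream" keyword (plus its line break)
    // and the last "endstream" keyword, or null if either marker is missing.
    public static byte[] extractPayload(byte[] pdf) {
        int start = indexOf(pdf, STREAM, 0);
        if (start < 0) {
            return null;
        }
        start += STREAM.length;
        // Assumption: the keyword is followed by CRLF or a lone LF.
        if (start < pdf.length && pdf[start] == '\r') start++;
        if (start < pdf.length && pdf[start] == '\n') start++;
        int end = lastIndexOf(pdf, ENDSTREAM);
        if (end < start) {
            return null;
        }
        return Arrays.copyOfRange(pdf, start, end);
    }
}
```

Any line break that sits just before "endstream" stays in the returned bytes. That is probably harmless for JPEG data, but trim it if it causes trouble.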
 
Ivan Bosnjak
Greenhorn
Posts: 15
Thank you very much Ulf! You saved me a lot of time.

I'm done with the program and it works just like I wanted it.

Thanks again!
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
[Ulf]: This may well not work - be sure to read the javadocs for the skip method.

Not to mention the docs for read(). Both read() and skip() have return values that should not be ignored, and both calls should be put inside loops to ensure that they have their complete desired effect. Or you can use a RandomAccessFile and use seek() and readFully() instead of skip() and read().
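For the RandomAccessFile route, a short sketch might look like the following. TailReader, readTail and readRange are hypothetical names chosen for this example. seek() moves straight to a file position, and readFully() either fills the whole array or throws EOFException, so no loop is needed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, for illustration only.
final class TailReader {

    // Reads the last 'count' bytes of the file, or the whole file if it is shorter.
    public static byte[] readTail(String path, int count) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long length = file.length();
            int n = (int) Math.min(count, length);
            byte[] tail = new byte[n];
            file.seek(length - n);   // absolute position in the file
            file.readFully(tail);    // fills the array or throws EOFException
            return tail;
        }
    }

    // Reads 'length' bytes starting at file position 'start'.
    public static byte[] readRange(String path, long start, int length) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            byte[] range = new byte[length];
            file.seek(start);
            file.readFully(range);
            return range;
        }
    }
}
```

With this, readTail(path, 1000) gives the last 1000 bytes to search for "endstream", and readRange(path, 0, 1000) gives the first 1000 bytes to search for "stream".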
 