Corrupt PDF Files

Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Okay. So all is good and my PDF files are streaming great. Funny thing is, some of the files are coming back as corrupt. When I hit them directly using just a path, they are not corrupt. Does anyone have any ideas as to what might cause this?


By failing to prepare, you are preparing to fail.
Benjamin Franklin (1706 - 1790)
Dave Landers
Ranch Hand

Joined: Jul 24, 2002
Posts: 401
Are you using Reader or Writer rather than InputStream or OutputStream?
Are you using String or char[] rather than byte[]?
Are you ever converting between byte and String or char?
Are you using a mime-type that represents text (so that maybe your browser is doing one of the above)?
[ August 29, 2002: Message edited by: Dave Landers ]
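The questions above all point at one failure mode: binary data that passes through a character conversion somewhere on its way to the client. A minimal sketch of that effect follows. The class name, the helper name, and the sample bytes are made up for illustration, and UTF-8 is only an assumed charset; the thread does not say which encoding (if any) was involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteRoundTrip {
    // Round-trips raw bytes through a String, which is roughly what happens
    // when binary data is pushed through a Reader/Writer or a String.
    // (Hypothetical helper; UTF-8 is an assumption for the demo.)
    static byte[] throughString(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Illustrative sample: an ASCII prefix followed by bytes >= 0x80,
        // which do not form valid UTF-8 sequences.
        byte[] binary = {'%', 'P', 'D', 'F',
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        byte[] result = throughString(binary);
        System.out.println("identical after round trip: " + Arrays.equals(binary, result));
    }
}
```

Bytes that are not valid in the chosen charset get replaced during decoding, so the array that comes back differs from the one that went in. Plain ASCII content survives the round trip, which is one reason this kind of bug can show up for only some files.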
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Thanks for your quick reply. To answer your questions...
Q1) Are you using Reader or Writer rather than InputStream or OutputStream?
A1) I'm using a FileInputStream
Q2) Are you using String or char[] rather than byte[]?
A2) I am using a byte[] to read in the data
Q3) Are you ever converting between byte and String or char?
A3) No.
Q4) Are you using a mime-type that represents text (so that maybe your browser is doing one of the above)?
A4) Yes.
Keep this in mind. This streamer does work with many of the PDF files. But it doesn't work with all of them.
Here is my code.. see what you think...
Dave Landers
Ranch Hand

Joined: Jul 24, 2002
Posts: 401
At first glance, that looks OK to me.
Other things I'd probably check...
Try other browsers. I have seen IE ignore the content type in favor of the ".pdf" file extension at the end of the URL, which is annoying.
Make sure the rest of your servlet program is not writing things to the output stream that may disrupt the output (setting headers or whatever).
Write a Java program that connects to the servlet and dumps the retrieved stream to a file. Then you can compare the files and see where the problem is: is the output truncated, is the data changing, or something else? That might give you a clue about what to look for.
Good luck - chances are when you are done you will either cry "Doh!" or you will have learned something really useful.
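The download-and-compare check suggested above could be sketched roughly as follows. The class and method names are hypothetical, and the servlet URL and the path to the original file are placeholders passed on the command line; nothing here comes from the original poster's code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class StreamCompare {
    // Saves whatever the URL returns, byte for byte, to the target file.
    static void download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Returns the offset of the first differing byte, the length of the
    // shorter file if one is a truncated prefix of the other, or -1 if
    // the two files are identical.
    static long firstDifference(Path a, Path b) throws IOException {
        byte[] x = Files.readAllBytes(a);
        byte[] y = Files.readAllBytes(b);
        int common = Math.min(x.length, y.length);
        for (int i = 0; i < common; i++) {
            if (x[i] != y[i]) {
                return i;
            }
        }
        return x.length == y.length ? -1 : common;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: URL of the servlet, args[1]: path to the original PDF
        // (both are placeholders supplied by whoever runs the check).
        Path original = Paths.get(args[1]);
        Path downloaded = Files.createTempFile("streamed", ".pdf");
        download(URI.create(args[0]).toURL(), downloaded);
        System.out.println("original size:   " + Files.size(original));
        System.out.println("downloaded size: " + Files.size(downloaded));
        System.out.println("first difference at offset: " + firstDifference(original, downloaded));
    }
}
```

A size mismatch with no differing byte inside the common prefix points to truncation. A difference partway through the file points to the data being altered in transit.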
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Good idea about writing the file back out and comparing. Will do.
-Dale
David Weitzman
Ranch Hand

Joined: Jul 27, 2001
Posts: 1365
The BufferedOutputStream seems a bit redundant (you do some buffering yourself), but does it help if you flush() it?
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Okay, so here's the deal. It seems that the file is being truncated. Streaming the file back to another file to do a compare worked great! Terrific idea! I'm about 389 bytes short. It seems those last bytes are being lost. Any ideas on how to avoid this?
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Originally posted by David Weitzman:
The BufferedOutputStream seems a bit redundant (you do some buffering yourself), but does it help if you flush() it?

Hmm.. yeah. I think you're right. It might be redundant. Which method might be more useful to keep?
Dale
David Weitzman
Ranch Hand

Joined: Jul 27, 2001
Posts: 1365
I would just write straight to outStream. And did you try flush()ing?
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Originally posted by David Weitzman:
I would just write straight to outStream. And did you try flush()ing?

Thanks.. Flushing out the buffer did the trick. Thanks so much!
Dale
Ron Newman
Ranch Hand

Joined: Jun 06, 2002
Posts: 1056
Glad that worked, but I'm wondering why the buffer didn't flush automatically when it was closed (or when it was finalized).


Ron Newman - SCJP 1.2 (100%, 7 August 2002)
Dave Landers
Ranch Hand

Joined: Jul 24, 2002
Posts: 401
The servlet container would have flushed the response.getOutputStream() stream (probably just by closing it). But it would not know about leftover bytes sitting in the BufferedOutputStream that wrapped it. Those are the ones that got lost.
Flushing that buffer gets those bytes to the response stream where the servlet container can flush them to the client.
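The lost-tail effect described above can be reproduced outside a servlet container. In this sketch a ByteArrayOutputStream stands in for response.getOutputStream(), and the buffer size, chunk size, and byte count are invented for the demo. The original poster's code is not shown in the thread, so this is an illustration of the mechanism, not a reconstruction of that code.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class FlushDemo {
    // Copies totalBytes through a BufferedOutputStream in fixed-size chunks
    // and reports how many bytes actually reached the underlying stream.
    // (Hypothetical helper; all sizes are assumptions for illustration.)
    static int bytesDelivered(int totalBytes, boolean flush) throws IOException {
        ByteArrayOutputStream underlying = new ByteArrayOutputStream(); // stand-in for the response stream
        OutputStream buffered = new BufferedOutputStream(underlying, 1024);
        byte[] chunk = new byte[512];
        int remaining = totalBytes;
        while (remaining > 0) {
            int n = Math.min(chunk.length, remaining);
            buffered.write(chunk, 0, n);
            remaining -= n;
        }
        if (flush) {
            buffered.flush(); // pushes the bytes still held in the wrapper's buffer
        }
        // The wrapper is deliberately never closed, mirroring the case where
        // only the underlying stream gets flushed or closed by someone else.
        return underlying.size();
    }

    public static void main(String[] args) throws IOException {
        int total = 1024 + 389; // arbitrary size that leaves a partial buffer at the end
        System.out.println("without flush: " + bytesDelivered(total, false) + " of " + total);
        System.out.println("with flush:    " + bytesDelivered(total, true) + " of " + total);
    }
}
```

Without the flush, the final partial buffer never reaches the underlying stream, so the output comes up short. Calling flush() (or close()) on the wrapper itself, not just on the stream it wraps, delivers the full byte count.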
Thomas Paul
mister krabs
Ranch Hand

Joined: May 05, 2000
Posts: 13974
Threads like this make me so proud to be a participant in JavaRanch!


Associate Instructor - Hofstra University
Amazon Top 750 reviewer - Blog - Unresolved References - Book Review Blog
Michelle Chen
Greenhorn

Joined: Mar 20, 2008
Posts: 1
Hi,

I think you can try a utility called Advanced PDF Repair to repair your PDF file. It works rather well for my corrupt PDF files. Its web address is http://www.datanumen.com/apdfr/