HTTP receiving problems with non text files.

James McIntyre
Greenhorn

Joined: Feb 10, 2011
Posts: 3
I am trying to figure out a problem with a client that requests and receives files from a server. Text-based files work. The requests are pipelined, so knowing the exact length of each response body is important. If the content type is "text/<anything>", I read the body into a character array and save that array to a file. Even with large files I am able to keep track of the Content-Length and the number of chars I have read in so far. However, when the file is not text, I have problems saving it. If the file is a PDF, for example, I read the body into a byte array and then write that to a file, but the number of bytes read in seems to be shorter than the Content-Length.

For example, I have a PDF with a Content-Length of 36975 bytes. After reading in 35799 of that, my variables say I have 2159 left, but only 983 bytes are read in. With my logic, there is nothing left on the next read. Initially I thought my logic was incorrect, but I use the same logic for text files. The only difference is that for text I use a BufferedReader, and for anything else I use an InputStream to read from the socket.

Any ideas?
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541

Really, if you're just receiving files and saving them locally, you should be treating them all as streams of bytes. Converting the bytes could possibly cause problems if you use the wrong encoding. Converting bytes to chars and then back to bytes is at best a waste of effort.
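To make the "everything is a stream of bytes" idea concrete, here is a minimal sketch of copying a body of known length. The class and method names (BodyCopy, copyBody) are made up for illustration and are not taken from the code discussed in this thread. It assumes the Content-Length has already been parsed and that the InputStream is positioned at the first byte of the body. The point it shows is that a single read() may hand back fewer bytes than requested, so the loop has to count what it actually received.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class BodyCopy {
    /**
     * Copies exactly contentLength bytes from in to out.
     * Never reads past the end of the body, so a pipelined
     * response that follows it stays in the stream.
     */
    static void copyBody(InputStream in, OutputStream out, long contentLength) throws IOException {
        byte[] buffer = new byte[8192];
        long remaining = contentLength;
        while (remaining > 0) {
            // Ask for no more than what is left of this body.
            int want = (int) Math.min(buffer.length, remaining);
            int n = in.read(buffer, 0, want);
            if (n == -1) {
                // The stream ended early: the body really is truncated.
                throw new EOFException("Stream ended with " + remaining + " bytes still expected");
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a socket stream: a 10-byte body followed by 5 bytes of the next response.
        InputStream in = new ByteArrayInputStream(new byte[15]);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyBody(in, out, 10);
        System.out.println("copied=" + out.size() + ", left in stream=" + in.available());
    }
}
```

With a real socket you would pass a FileOutputStream as out; the same loop works for text and binary bodies alike.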

However if you say your code for copying the stream of bytes has a problem, and you're asking for ideas, then my idea would be to investigate the problem and fix it if necessary.

I say "if necessary" because you don't say that the files are being truncated, you just say your logic doesn't seem to process the number of bytes you think it should process. So, first step, find out if the files are actually being truncated. For example try to open one of the PDFs in Acrobat Reader. If they aren't being truncated then you don't actually have a problem.

Or if they are being truncated, then I would recommend looking at the code to see why. If you can't see why then you could post it here and ask about it.
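Besides opening the file in a viewer, the truncation check can be done by comparing the size of the saved file with the Content-Length the server announced. This is only a sketch; TruncationCheck and missingBytes are hypothetical names, and the sizes in main are arbitrary test values, not the ones from the post above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TruncationCheck {
    /** Returns how many bytes the saved file is short of contentLength (0 means complete). */
    static long missingBytes(Path savedFile, long contentLength) throws IOException {
        return contentLength - Files.size(savedFile);
    }

    public static void main(String[] args) throws IOException {
        // Simulate a download that stopped early: 1000 bytes saved, 1200 announced.
        Path saved = Files.createTempFile("body", ".bin");
        try {
            Files.write(saved, new byte[1000]);
            System.out.println("missing bytes: " + missingBytes(saved, 1200));
        } finally {
            Files.deleteIfExists(saved);
        }
    }
}
```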
James McIntyre
Greenhorn

Joined: Feb 10, 2011
Posts: 3
From what I have found I agree I need to keep the data as a stream of bytes. The following is the code I have for doing that.


For any binary file, index always returns -1 before bodyReadIn = bodyLength. This results in a corrupted file. What am I missing here? I really appreciate the help.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541

James McIntyre wrote: From what I have found I agree I need to keep the data as a stream of bytes.


But you're using a Reader. That directly contradicts the idea of keeping it as a stream of bytes.

And worse, it's a BufferedReader, which means it uses a buffer. So the BufferedReader reads up to a buffer's worth of characters (8192 by default) from the InputStream you gave it, and you go through that buffer looking for something. Eventually you decide you've found it, so then you start reading from the InputStream. You think you will start reading immediately after the last data you got from the BufferedReader. But no. The BufferedReader still has more data in its buffer, which it has already taken from the InputStream, and you won't get that data by reading from the InputStream.
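The read-ahead described here can be seen in a small experiment, and one possible way around it is to read the header lines from the raw InputStream one byte at a time, so nothing past the blank line is consumed. This is a sketch under assumptions: HeaderLines and readLine are invented names, the response bytes are a made-up example, and the header text is decoded as ISO-8859-1.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class HeaderLines {
    /**
     * Reads one line (terminated by LF, CRs dropped) from the raw stream
     * without consuming anything after it. Returns null at end of stream.
     */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] response = "HTTP/1.1 200 OK\r\nContent-Length: 4\r\n\r\nBODY"
                .getBytes(StandardCharsets.ISO_8859_1);

        // Variant 1: a BufferedReader on top of the stream, reading only the status line.
        InputStream first = new ByteArrayInputStream(response);
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(first, StandardCharsets.ISO_8859_1));
        reader.readLine();
        System.out.println("left after BufferedReader.readLine(): " + first.available());

        // Variant 2: byte-wise header reading, stopping at the blank line.
        InputStream second = new ByteArrayInputStream(response);
        String line;
        while ((line = readLine(second)) != null && !line.isEmpty()) {
            // A real client would parse Content-Length and Content-Type here.
        }
        System.out.println("left after byte-wise header read: " + second.available());
    }
}
```

In the first variant the count should come out smaller than the number of bytes that follow the status line, because the reader has already pulled them into its own buffer. In the second, the body bytes are still in the stream. Wrapping the socket stream in a BufferedInputStream once, and using that same object for both headers and body, keeps the byte-wise reads cheap.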
James McIntyre
Greenhorn

Joined: Feb 10, 2011
Posts: 3
I misunderstood what you said initially. Thank you for clearing that up for me!
 