HTTP receiving problems with non-text files

Greenhorn

Posts: 3

posted 13 years ago

I am trying to figure out a problem with a client that requests and receives files from a server. The requests are pipelined, so knowing the exact length of each response body is important. If the content type is "text/<anything>", I read the body into a character array and save that array to a file. Even with large files I can keep track of the Content-Length and the number of chars I have read so far. When the file is not text, however, I have problems saving it. If the file is a PDF, for example, I read the body into a byte array and then write that to a file, but the number of bytes read in seems to be smaller than the Content-Length.

For example, I have a PDF with a Content-Length of 36975 bytes. At the point where my variables say I have 2159 bytes left, the next read returns only 983 bytes, for a total of 35799. With my logic, the read after that finds nothing left. Initially I thought my logic was incorrect, but I use the same logic for text files. The only difference is that for text I use a BufferedReader, and for anything else I use the InputStream to read from the socket.

Any ideas?

Paul Clapham

Marshal

Posts: 28226

posted 13 years ago

Really, if you're just receiving files and saving them locally, you should be treating them all as streams of bytes. Converting the bytes could possibly cause problems if you use the wrong encoding. Converting bytes to chars and then back to bytes is at best a waste of effort.
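To make the "stream of bytes" idea concrete, here is a minimal sketch of copying exactly a known number of bytes from one stream to another. The class and method names (ExactCopy, copyExactly) are made up for illustration, and the in-memory streams stand in for the socket and the file; it is not the only way to do this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ExactCopy {

    /**
     * Copies exactly {@code length} bytes from in to out (hypothetical helper).
     * Throws EOFException if the stream ends before that many bytes arrive.
     */
    static void copyExactly(InputStream in, OutputStream out, long length) throws IOException {
        byte[] buffer = new byte[1024];
        long remaining = length;
        while (remaining > 0) {
            // Never ask for more than the body still owes us, so the next
            // pipelined response is left untouched in the stream.
            int want = (int) Math.min(buffer.length, remaining);
            int got = in.read(buffer, 0, want);
            if (got < 0) {
                throw new EOFException("Stream ended with " + remaining + " bytes still expected");
            }
            out.write(buffer, 0, got);
            remaining -= got;
        }
    }

    public static void main(String[] args) throws IOException {
        // Seven bytes standing in for two back-to-back response bodies.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6, 7});
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        copyExactly(in, first, 4);
        // prints "4 bytes copied, next byte is 5"
        System.out.println(first.size() + " bytes copied, next byte is " + in.read());
    }
}
```

The same loop works for text and binary bodies alike, because nothing is ever decoded into chars.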

However if you say your code for copying the stream of bytes has a problem, and you're asking for ideas, then my idea would be to investigate the problem and fix it if necessary.

I say "if necessary" because you don't say that the files are being truncated, you just say your logic doesn't seem to process the number of bytes you think it should process. So, first step, find out if the files are actually being truncated. For example try to open one of the PDFs in Acrobat Reader. If they aren't being truncated then you don't actually have a problem.
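Besides opening the file in a viewer, a quick programmatic check is to compare the size of the saved file with the Content-Length the server announced. The sketch below assumes a hypothetical helper named missingBytes and uses a temporary file filled with the byte counts from the original post in place of a real download.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TruncationCheck {

    /** Returns how many bytes short of the announced Content-Length the saved file is. */
    static long missingBytes(Path savedFile, long contentLength) throws IOException {
        return contentLength - Files.size(savedFile);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("body", ".bin");
        try {
            // Stand-in for a download that stopped after 35799 bytes.
            Files.write(tmp, new byte[35799]);
            // prints "1176 bytes missing"
            System.out.println(missingBytes(tmp, 36975) + " bytes missing");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A result of 0 means the file on disk is as long as the server said it would be; anything greater than 0 means it really was truncated.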

Or if they are being truncated, then I would recommend looking at the code to see why. If you can't see why then you could post it here and ask about it.

James McIntyre

Greenhorn

Posts: 3

posted 13 years ago

From what I have found, I agree that I need to keep the data as a stream of bytes. The following is the code I have for doing that.

InputStream in = clientSocket.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(in));
FileOutputStream fos;
int left = 0;
int bodyReadIn = 0;
String tmp;
int index;
int sizeOfByte = 1024;
....
// read the header lines up to the blank line that ends them
while (!(tmp = br.readLine()).equals("")) {
    if ((index = tmp.indexOf("Content-Length: ")) != -1) {
        index += 16; // sets the index to the end of "Content-Length: "
        tmp = tmp.substring(index);
        bodyLength = Integer.parseInt(tmp);
    }
    if ((tmp.indexOf("Content-Type: text/")) != -1) {
        isBinary = false;
    }
}

....
// if the file is binary
else {
    fos = new FileOutputStream(saveTo);

    byte[] bytes = new byte[sizeOfByte];
    // read and write the body
    while (bodyReadIn != bodyLength) {

        left = bodyLength - bodyReadIn;
        // if the remainder of the body is < size of byte array
        if (left < sizeOfByte) {
            index = in.read(bytes, 0, left);
            bodyReadIn += index;
        }
        else { // left >= sizeOfByte
            index = in.read(bytes, 0, sizeOfByte);
            bodyReadIn += index;
        }
        // read() returned -1: the stream ended before the whole body arrived
        if (index < 0) {
            fos.flush();
            fos.close();
            break;
        }
        fos.write(bytes, 0, index);

    }
    fos.close();

}

For any binary file, in.read() always returns -1 (stored in index) before bodyReadIn reaches bodyLength. This results in a corrupted file. What am I missing here? I really appreciate the help.

Paul Clapham

Marshal

Posts: 28226

posted 13 years ago

James McIntyre wrote: From what I have found I agree I need to keep the data as a stream of bytes.

But you're using a Reader. That directly contradicts the idea of keeping it as a stream of bytes.

And worse, it's a BufferedReader, which means it uses a buffer. The BufferedReader reads a chunk of data from the InputStream you gave it, and you go through that buffer looking for something. Eventually you decide you've found it, so then you start reading from the InputStream. You think you will start reading immediately after the last data you got from the BufferedReader. But no. It has more data in its buffer which it already took from the InputStream, and you won't get that data by reading from the InputStream.
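Here is a small sketch of that read-ahead effect, along with one possible way around it: reading the header lines one byte at a time from the same InputStream that will later supply the body. The class and method names (HeaderReadAhead, readHeaderLine and so on) are invented for this example, the response is a tiny in-memory stand-in for the socket, and the header charset is assumed to be ISO-8859-1.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class HeaderReadAhead {

    /** Reads one CRLF- or LF-terminated line directly from the stream, with no read-ahead. */
    static String readHeaderLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.write(b);
            }
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Parses headers with a BufferedReader, then returns the next byte of the raw stream. */
    static int nextByteAfterBufferedReader(byte[] response) throws IOException {
        InputStream raw = new ByteArrayInputStream(response);
        BufferedReader br = new BufferedReader(new InputStreamReader(raw, StandardCharsets.ISO_8859_1));
        while (!br.readLine().isEmpty()) {
            // skip header lines; the reader has already pulled the body into its buffer
        }
        return raw.read();
    }

    /** Parses headers straight from the raw stream, then returns its next byte. */
    static int nextByteAfterRawHeaders(byte[] response) throws IOException {
        InputStream raw = new ByteArrayInputStream(response);
        while (!readHeaderLine(raw).isEmpty()) {
            // skip header lines; only the header bytes are consumed
        }
        return raw.read();
    }

    public static void main(String[] args) throws IOException {
        byte[] response = "Content-Length: 4\r\n\r\nBODY".getBytes(StandardCharsets.ISO_8859_1);
        // prints "after BufferedReader: -1" because the body is already in the reader's buffer
        System.out.println("after BufferedReader: " + nextByteAfterBufferedReader(response));
        // prints "after raw headers: B"
        System.out.println("after raw headers: " + (char) nextByteAfterRawHeaders(response));
    }
}
```

Reading one byte at a time from a bare socket stream is slow; wrapping the socket stream in a single BufferedInputStream and using that one object for both the headers and the body keeps the buffering without losing any data.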

James McIntyre

Greenhorn

Posts: 3

posted 13 years ago

I misunderstood what you said initially. Thank you for clearing that up for me!