I can't speak for what's fastest, but here's the code I use:
In terms of efficiency, one important thing is to make sure you're using StringBuffers not Strings (as String concatenation is expensive). However, it may be that the speed of your network is sufficiently slow that the concatenation isn't what's causing the slowness.
BTW, this code is hackish ... I've only used it when playing around. I haven't put much time into writing it particularly prettily. In particular, the while loop is nasty and C-like. I would rewrite it in production code...but you get the idea
--Tim [ June 27, 2004: Message edited by: Tim West ]
Tim West's explanation is reasonable. Using a StringBuffer is better than using String, at least when processing long runs of characters.
Craig Sullivan, what do you mean by "faster way to read the web page"? Do you mean which readers or input streams should be used in your code? Could you provide more info about your code so that we can help you further?
I want to know the fastest way to get the content of the page from the server to my Java client.
Joined: Mar 15, 2004
Well, you're limited by two things:
The speed of the connection between the remote server and your local box.
The speed of your Java code.
For the latter, implement a decent solution that uses a BufferedReader and StringBuffers not Strings. I'm not aware of anything else that will significantly increase your code speed in this situation. If there is anything, I'm sure someone else will point it out soon.
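A minimal sketch of that approach, not the code from the first post (which isn't shown here): it reads characters in chunks through a BufferedReader and appends them to a StringBuffer rather than concatenating Strings. The class and method names (PageReader, readAll) and the 8192-char chunk size are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class PageReader {
    // Reads everything from the reader into one String. Chunks are appended
    // to a StringBuffer, so no intermediate Strings are created per read.
    static String readAll(Reader source) throws IOException {
        StringBuffer page = new StringBuffer();
        char[] chunk = new char[8192]; // arbitrary chunk size (assumption)
        try (BufferedReader in = new BufferedReader(source)) {
            int n;
            while ((n = in.read(chunk, 0, chunk.length)) != -1) {
                page.append(chunk, 0, n);
            }
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a network stream.
        System.out.println(readAll(new StringReader("<html>\n</html>")));
    }
}
```

For a real page you would pass in an InputStreamReader wrapped around the stream from URL.openStream(). On Java 5 and later, StringBuilder is a drop-in, unsynchronized alternative to StringBuffer.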
Then, unless your connection is really fast, I'd say it's highly likely that your connection, not the code, is the performance bottleneck. So, upgrade your inter|intranet facilities.
In any case, it should be relatively simple to profile your code to work out which methods are taking most time. Then you can decide where to optimise next.
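Short of a real profiler, a rough way to see where the time goes is to time the phases separately. This is only a sketch: PhaseTimer and timePhases are made-up names, UTF-8 is an assumed encoding, and InputStream.readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class PhaseTimer {
    // Returns { nanoseconds spent reading bytes,
    //           nanoseconds spent decoding them to a String,
    //           number of chars decoded }.
    static long[] timePhases(InputStream in) throws IOException {
        long t0 = System.nanoTime();
        byte[] raw = in.readAllBytes();                        // network-bound phase
        long t1 = System.nanoTime();
        String text = new String(raw, StandardCharsets.UTF_8); // CPU-bound phase
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1, text.length() };
    }
}
```

If the first number dwarfs the second, the connection rather than the Java code is the likely bottleneck.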
Joined: Nov 21, 2002
My main concern is with reducing round-trips to and from the server. Is there a certain method of downloading the data from the server that will reduce round-trips?
For example, does BufferedReader.readLine() use more round trips than BufferedReader.read()? I tried changing the buffer size, but the largest download I could get was 2555 bytes. Is the max buffer size dependent on HTTP or is there some parameter within the JDK that I can change?
Joined: Mar 15, 2004
Hmm. I'm not qualified to give a definitive answer at this point, but I can offer some more thoughts.
Firstly, the size of any given packet (at the lowest level) is determined by your Maximum Transmission Unit, or MTU. This is an OS-level concern, and something Java has no control over. For Ethernet NICs, it's generally 1500 bytes or a little less (I think; mine is around 1480).
This is the maximum packet size. It includes all the HTTP/TCP/IP headers, checksums and whatever else the different layers on the network stack put in. So, you don't get a huge amount of data in an individual packet. I'm not familiar enough with the various protocols to know, but I think any network connection always involves round trips of a sort - the TCP 3-way handshake to start, then the process of accepting each packet from the source and requesting more data. Do you want to reduce this sort of round trip, or have I missed something?
However, all this is transparent to a Java app. As far as Java's concerned, you get a byte stream (well, URL.openStream() returns an InputStream) and read happily away.
I think from a Java POV, all you can do is use a larger buffer in the BufferedReader. Then you avoid the possibility that the buffer could fill and the connection would have to stall. That said, I'd guess most OSs would buffer network connections themselves, but that is complete speculation.
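For what it's worth, the buffer size is just the second constructor argument of BufferedReader. A sketch, where the 64K size, the UTF-8 charset, and the names BigBufferReader, wrap and countLines are all assumptions for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BigBufferReader {
    // Hypothetical buffer size; the default in BufferedReader is much smaller.
    static final int BUFFER_CHARS = 64 * 1024;

    // Wraps a raw byte stream in a reader with the larger buffer.
    static BufferedReader wrap(InputStream raw) {
        return new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.UTF_8), BUFFER_CHARS);
    }

    // Example use: count the lines in the stream.
    static int countLines(InputStream raw) throws IOException {
        int lines = 0;
        try (BufferedReader in = wrap(raw)) {
            while (in.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }
}
```

Whether a bigger buffer actually helps depends on where the time is going, so it's worth measuring before and after.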
Anyway, there are some random thoughts that may or may not help.
I'm curious though - what do you mean the largest download you could get was 2555 bytes? Is that one packet or the total download size?
Dunno whether I helped or not jus' then, but there ya go
Joined: Nov 21, 2002
Here's the deal as far as I know: TCP (which HTTP runs over) has a flexible window. It will send more or fewer packets at one time without an ACK from the client, depending on the speed and stability of the network.
After I created a BufferedReader, I called available() on the stream and got back 2555, or some such. This tells me that I can only read in 2555 bytes at one time.
When I use BufferedReader.readLine() to read an 8 MB web page, it takes 25 seconds. When I use my web browser, it takes 5 seconds. Somewhere, I don't know where, the amount I can read in at one time is being limited to 2555 bytes. I don't believe my TCP/IP stack is limiting my download. I believe there is some parameter in Java that is limiting the number of bytes I can DL at one time to 2555. If the download size were bigger, not as many ACKs would be sent from my client, and the download would be faster.
I may need to dig into the JDK to see what's going on. [ July 02, 2004: Message edited by: Craig Sullivan ]
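One thing worth checking about the 2555 figure: InputStream.available() is documented as an estimate of how many bytes can be read right now without blocking, not a cap on the download. The sketch below simulates that with a made-up ChunkyStream that never hands out more than 2555 bytes per call; a plain read loop still gets everything. All names here are hypothetical, and the stream is an in-memory stand-in, not a real socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class AvailableDemo {
    // Hypothetical stand-in for a socket stream: at most 2555 bytes per call.
    static class ChunkyStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        ChunkyStream(byte[] data) { this.data = data; }

        @Override public int available() {
            return Math.min(2555, data.length - pos);
        }

        @Override public int read() {
            return pos < data.length ? (data[pos++] & 0xff) : -1;
        }

        @Override public int read(byte[] buf, int off, int len) {
            if (len == 0) return 0;
            if (pos >= data.length) return -1;
            int n = Math.min(len, available());
            System.arraycopy(data, pos, buf, off, n);
            pos += n;
            return n;
        }
    }

    // Keeps reading until end of stream, however small each chunk is.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64 * 1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

So a small available() value by itself doesn't show that Java is throttling the transfer; it only says how much had arrived at the moment of the call.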
Joined: Mar 15, 2004
Hmm, this is out of my depth now :-)
To confirm your ideas on packets, you might like to use Ethereal (or something) to see if there are any obvious differences between the way Java is doing TCP/IP as compared to your web browser.
Also, did you play with the size of the buffer in the BufferedReader? Make it 8 MB and see if you get speeds comparable with the browser.
Anyway, what I'm writing now is speculation more than well-founded advice, so take it at your peril. Would be interesting to know what the cause of all this is, though.
Your code will likely be much faster than the Internet. I have a little program that downloads files and shows the bytes per second after every 1k bytes. I can run one thread or five and the BPS is the same for each. My code is not the bottleneck. If I had a need for 50 threads it might be.
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
BufferedReader.readLine() has a huge overhead: it converts a byte stream to characters, builds a line, and finally converts that to a String. For speed:
1. Never convert to characters; stay with bytes.
2. Start with a monstrous byte array and read directly into it, probably with the read(buf, offset, length) method, where length is the result of calling available(). Or you might use the ServletInputStream readLine(buf, off, len) method, which returns -1 at EOF and lets you count lines.
Bill
Hmm, so based on William's post, using a BufferedInputStream over a BufferedReader is definitely a good thing - you get the advantages of buffering without the overhead of character/String conversion.
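A rough sketch of the bytes-only idea, reading straight into one big array with read(buf, offset, length). One deliberate difference from Bill's description: the length passed here is the space left in the array rather than the result of available(), since available() can legitimately return 0 before the end of the stream. ByteFill and fill are made-up names, and how big to make the array is up to the caller.

```java
import java.io.IOException;
import java.io.InputStream;

class ByteFill {
    // Fills buf from the stream with no character conversion.
    // Returns the number of bytes stored; stops at end of stream
    // or when the array is full, whichever comes first.
    static int fill(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n == -1) {
                break; // end of stream
            }
            off += n;
        }
        return off;
    }
}
```

Wrapping the source in a BufferedInputStream first is optional with reads this large, because each call already asks for as much as the array can hold.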
Author and all-around good cowpoke
Joined: Mar 22, 2000
Right - but I think that reading directly from the ServletInputStream would be best. Remember, the operating system TCP/IP stack already has a buffer to hold a packet (?or maybe more than one?) - there is no need to introduce another buffer, just grab the bytes as they become available. Bill