I want to read parts of a file (e.g. positions 375-392, or 4,586-4,599). What's the best way to do this? I assume I don't want to load the whole thing into an array via InputStream, right? What if I'll eventually have read the entire file by the time the processing is done, but I'll be jumping all over the file along the way? Does that make a difference?
Hi Robert, I would say the fastest way to do this would be to create a RandomAccessFile and get its FileChannel (from NIO) to read and write the file. Reading and writing files has been massively improved by NIO. In my experience, RandomAccessFile without NIO is very slow. Reading in the whole file may be quicker, depending on the size of the file and the percentage of it you want to process. If it is only a small file, it may be quicker to just read the whole thing in. Chris
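For illustration, the RandomAccessFile-plus-channel approach might look something like the sketch below. The class name ChannelRangeReader and the helper readRange are made up for this example, and it uses try-with-resources, so it assumes Java 7 or later (on 1.4 you would close the file in a finally block instead). The end offset is treated as exclusive.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper class, not part of any library.
class ChannelRangeReader {

    /** Reads the bytes in [start, end) from the file at the given path. */
    static byte[] readRange(String path, long start, long end) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel channel = raf.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate((int) (end - start));
            // Move the channel to the first byte we want.
            channel.position(start);
            // A single read() may return fewer bytes than requested, so loop
            // until the buffer is full or the file ends.
            while (buffer.hasRemaining()) {
                if (channel.read(buffer) < 0) {
                    throw new EOFException("File ended before offset " + end);
                }
            }
            // Heap buffers from allocate() expose their backing array.
            return buffer.array();
        }
    }
}
```

To get positions 375-392 inclusive you would call ChannelRangeReader.readRange(path, 375, 393). If you are jumping around a lot, it would probably be better to keep one channel open and reposition it than to reopen the file for every range as this helper does.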
[Chris]: In my experience, RandomAccessFile without NIO is very slow.

My experience matches this, for JDK 1.2 and 1.3. From this I developed a habit of avoiding RAF like the plague; I could always find a faster solution using streams. Even for random access, you can create a new FileInputStream and use skip() to get to the spot you want, and read from there - and I found this to be faster than using the evil RAF, even if I had to create a new FIS for each separate read (since you can't move backwards with skip()).

However, I recently broke my own rules and tried using RAF again. It seems that the RAF in 1.4 is comparable to using a FileChannel - it's much, much faster than it used to be. In some cases it may even be faster than FileChannel, but that's probably because it's not always obvious how to get the best performance from FileChannel, and there's a bit of a learning curve with NIO. So for JDK 1.4+, I'd say there are three basic options:
Just use RAF.
Use FileChannel's position(long) and read(ByteBuffer) methods.
Use FileChannel's map() to get a MappedByteBuffer of the whole file.
The last has more overhead - it doesn't make sense for a short file you're only accessing a few times, but if it's a big file, and/or if you're going to be accessing it a lot, it's the best way to go. Between the other two options, I'm not sure which is really preferable; test and find out. RAF is probably simpler, unless you want to use some of the other NIO-specific methods or classes. E.g. FileChannel's transferTo() and transferFrom() are pretty slick if you've got other channels to interact with. And a Selector is great for running an efficient server, which then encourages you to use channels and buffers throughout the system. But if you don't need other NIO features like that, RAF is probably fine nowadays. And if you do have the misfortune to be using a JDK < 1.4, just stick to FileInputStream and skip(). RAF sux.
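To make the options a bit more concrete, here is a rough sketch of plain RAF, a mapped buffer, and the old FileInputStream/skip() fallback. The class RangeReaders and all of its method names are invented for this example, and it uses try-with-resources (Java 7+) purely to keep it short; treat it as a starting point for your own timing tests rather than a recommendation.

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper class, not part of any library.
class RangeReaders {

    /** Plain RandomAccessFile: seek, then read exactly length bytes. */
    static byte[] readWithRaf(String path, long start, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] out = new byte[length];
            raf.seek(start);
            raf.readFully(out); // throws EOFException if the file is too short
            return out;
        }
    }

    /**
     * Maps the whole file read-only. Do this once and reuse the buffer.
     * A single mapping is limited to Integer.MAX_VALUE bytes.
     */
    static MappedByteBuffer mapWholeFile(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel channel = raf.getChannel()) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Copies length bytes starting at start out of an already-mapped buffer. */
    static byte[] readFromMap(ByteBuffer map, int start, int length) {
        byte[] out = new byte[length];
        // duplicate() shares the content but has its own position, so the
        // caller's buffer is left untouched.
        ByteBuffer view = map.duplicate();
        view.position(start);
        view.get(out); // throws BufferUnderflowException if the range runs past the end
        return out;
    }

    /** Pre-1.4 style: a fresh FileInputStream per read, skipping forward. */
    static byte[] readWithSkip(String path, long start, int length) throws IOException {
        try (FileInputStream in = new FileInputStream(path)) {
            long remaining = start;
            // skip() may skip fewer bytes than asked for, so loop.
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    throw new EOFException("Could not skip to offset " + start);
                }
                remaining -= skipped;
            }
            byte[] out = new byte[length];
            int filled = 0;
            // read() may also return fewer bytes than requested.
            while (filled < length) {
                int n = in.read(out, filled, length - filled);
                if (n < 0) {
                    throw new EOFException("File ended before range was read");
                }
                filled += n;
            }
            return out;
        }
    }
}
```

With the mapped version you would call mapWholeFile(path) once up front and then readFromMap(map, 4586, 14) for each range, which is where the extra overhead of map() gets paid back.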
Thanks for the replies guys! Those helped a lot. We were using 1.3 but luckily switched to 1.4 recently, so that shouldn't be a problem. I think the best option is FileChannel's map(). BTW, what do you consider a big file? 1 meg+, 2 meg+? More?
[Robert]: what do you consider a big file?

I don't really know. The API documentation for map() seems to indicate that something like 10-30 kB is still "small" for most systems. I'll guess that once you're in the MB range it's considered "big". But really, that's just a guess; I've done almost no direct comparison here, and the results probably vary a lot by machine anyway.