The following Java app reads an entire file using a BufferedReader wrapped around a FileReader. Why does reading a 5 MB file use about 35,800 K of RAM? There must be a better way to do file I/O. I know I can use RandomAccessFile to read chunk by chunk, but I'm curious why Java uses so much memory to read in a 5 MB file, and whether there is a more memory-efficient way to read the entire file into RAM. I could see it using maybe 10 MB, but 35?
FYI, I just put the while(true) there so I could see the memory usage before the app exits.
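(The original code wasn't included in the post; based on the description -- a BufferedReader over a FileReader, accumulating lines into a StringBuffer, with a while(true) at the end -- a minimal reconstruction might look like the following. The class name and file path are placeholders.)

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadWholeFile {
    // Read the entire file into memory, one line at a time.
    static StringBuffer readAll(String path) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(path));
        StringBuffer sb = new StringBuffer();
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append('\n');  // readLine() strips the newline
        }
        reader.close();
        return sb;
    }

    public static void main(String[] args) throws IOException {
        StringBuffer contents = readAll(args[0]);
        System.out.println("Read " + contents.length() + " chars");
        while (true) { }  // spin so memory usage can be inspected before exit
    }
}
```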
First, Java strings use UTF-16 -- 16-bit char values. Therefore, the minimum amount of space needed to store the data is 10 MB, not 5.
Second, Java, being a garbage-collected language, often keeps objects around that it's no longer using, and tries to keep some free space available. So whereas a (space-efficient, time-inefficient) malloc library might grow its heap only as needed, the Java heap is always sized larger than the data it holds.
Third, the JVM itself is kinda big; just loading the core classes takes up a non-negligible amount of space.
Lastly, how big the heap grows depends on how much "object churn" there is in the program -- how many objects are created and destroyed. This one is pretty bad, actually: one String for each line of the file, plus many, many array resizings as the StringBuffer grows incrementally. There are much more efficient ways to write this program.
- You could use the StringBuffer constructor that takes a capacity as an argument. This will avoid the need to ever copy and resize the internal char array.
- More importantly, you could just use FileReader.read() to read data into a char buffer, then append those characters directly to the StringBuffer, without ever creating Strings. You could wrap it in a BufferedReader too, but if you're already reading in big chunks the extra buffering shouldn't really matter, and skipping it uses less memory:
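A sketch combining both suggestions -- pre-sizing the StringBuffer from the file length and reading through a reusable char buffer (the class name and buffer size are illustrative; file.length() is in bytes, so for a single-byte encoding it's an exact capacity, and otherwise an over-estimate, which is fine for avoiding resizes):

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class EfficientRead {
    static StringBuffer readAll(String path) throws IOException {
        File file = new File(path);
        // Pre-size the buffer so its internal char array never needs
        // to be copied and resized as data is appended.
        StringBuffer sb = new StringBuffer((int) file.length());
        FileReader reader = new FileReader(file);
        char[] buf = new char[8192];
        int n;
        // read() fills buf directly; no intermediate String objects
        // are created per line.
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        reader.close();
        return sb;
    }
}
```

Unlike the readLine() version, this allocates one StringBuffer and one char array for the whole file, so the only transient garbage is the Reader machinery itself.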