Writing memory efficient file IO code.

Gordon Glas
Greenhorn

Joined: Sep 09, 2004
Posts: 4
The following Java app reads an entire file using a BufferedReader wrapped around a FileReader. Why does reading a 5 MB file use about 35,800 KB of RAM? There must be a better way to do file I/O. I know I can use RandomAccessFile to read chunk by chunk, but I'm curious why Java uses so much memory to read a 5 MB file, and whether there is a more memory-efficient way to read the entire file into RAM. I could understand maybe 10 MB, but 35?

FYI, I just put the while(true) there so I can see the memory usage before the app exits.

Thanks so much,
Gordon


Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24183

Hi Gordon,

Welcome to JavaRanch!

Lots of things I can tell you.

First, Java uses Unicode: each char is 16 bits. Assuming the file stores one byte per character, the minimum amount of space needed to hold the data in memory is 10 MB, not 5.

Second, Java, being a garbage collected language, often keeps stuff around that it's not really using, and tries to keep some free space available. So whereas a (space efficient, time inefficient) malloc library might only grow its heap as needed, the Java heap's size is always larger than needed to hold the data.

Third, the JVM itself is kinda big; just loading the core classes takes up a non-negligible amount of space.

Lastly, how big the heap grows depends on how much "object churn" there is in the program -- how many objects are created and destroyed. This one is pretty bad, actually: one String for each line of the file, plus many, many array resizings as the StringBuffer grows incrementally. There are much more efficient ways to write this program.

- You could use the StringBuffer constructor that takes a capacity as an argument. This will avoid the need to ever copy and resize the internal char array.

- More importantly, you could use FileReader.read() to read data into a char[] buffer, then append those characters directly to the StringBuffer, without ever creating Strings. You could wrap the reader in a BufferedReader and do the same, but if you're reading in big chunks the extra buffering shouldn't really matter, and leaving it out will use less memory.
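
A minimal sketch of the two suggestions above, not the code originally posted in this thread. The class name `ReadWholeFile`, the method `readAll`, and the 8192-char chunk size are assumptions made for illustration. The file's length in bytes is used as a capacity estimate, which is exact for single-byte encodings and an overestimate for most others.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class ReadWholeFile {

    // Reads the whole file into a StringBuffer through one reusable char[] chunk,
    // so no per-line String objects are created along the way.
    static StringBuffer readAll(File file) throws IOException {
        if (file.length() > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large to hold in one buffer: " + file);
        }
        // Pre-size the buffer so its internal char array should not need to grow.
        StringBuffer sb = new StringBuffer((int) file.length());
        char[] chunk = new char[8192]; // chunk size is an arbitrary assumption

        // FileReader decodes with the platform default charset.
        try (Reader in = new FileReader(file)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                sb.append(chunk, 0, n); // append only the chars actually read
            }
        }
        return sb;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadWholeFile <file>");
            return;
        }
        StringBuffer contents = readAll(new File(args[0]));
        System.out.println("Read " + contents.length() + " chars");
    }
}
```

The try-with-resources statement needs Java 7 or later; on older JVMs the reader would be closed in a finally block instead.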



Hope this helps.


Gordon Glas
Greenhorn

Joined: Sep 09, 2004
Posts: 4
Thanks Ernest,
It makes more sense to reuse a single buffer. I was able to get memory usage down to about 28 megs with your code above, like so:



It's really fast too, because there is only a single read operation: the char buffer is allocated to the file's length.

I will be using RandomAccessFile for larger files anyway, but I just wanted to have a better understanding.
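
Sizing the buffer to the file's length, as described above, might look roughly like the sketch below. It is not the code posted in this thread; the names `ReadIntoSizedBuffer` and `readAll` are made up for illustration. One caveat: `Reader.read` is not guaranteed to fill the array in a single call, so the sketch loops until the buffer is full or the end of the file is reached, even though in practice one call often suffices.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.CharBuffer;

public class ReadIntoSizedBuffer {

    // Reads the file into a single char[] sized from the file length and returns
    // a CharSequence view over it, without copying the data into a String.
    static CharSequence readAll(File file) throws IOException {
        if (file.length() > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single array: " + file);
        }
        // Length in bytes; assumed to be at least the number of decoded chars.
        char[] buf = new char[(int) file.length()];
        int filled = 0;

        try (Reader in = new FileReader(file)) {
            int n;
            // read() may return fewer chars than requested, so keep going.
            while (filled < buf.length
                    && (n = in.read(buf, filled, buf.length - filled)) != -1) {
                filled += n;
            }
        }
        // Wrap only the part of the array that was actually filled.
        return CharBuffer.wrap(buf, 0, filled);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadIntoSizedBuffer <file>");
            return;
        }
        CharSequence contents = readAll(new File(args[0]));
        System.out.println("Read " + contents.length() + " chars");
    }
}
```

Returning a `CharBuffer` view avoids the second full-size copy that `new String(buf)` would make, which matters when the goal is to keep peak memory low.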

Thanks again!
Gordon
relli Toto
Greenhorn

Joined: Feb 03, 2009
Posts: 12
Nice example, Gordon.

The only problem is that when the file is big (35 MB), you'll run into memory problems.
The best solution is to read part of the file, write it out, then read the next part, and so on, in regular, same-size chunks!
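
A minimal sketch of that chunked approach, assuming the goal is to copy one file to another. The class name `ChunkedCopy`, the method `copy`, and the 64 KB chunk size are illustrative assumptions. Memory use stays at roughly one chunk no matter how large the file is.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedCopy {

    // Copies source to target one fixed-size chunk at a time and returns
    // the total number of bytes copied.
    static long copy(File source, File target) throws IOException {
        byte[] chunk = new byte[64 * 1024]; // chunk size is an arbitrary assumption
        long total = 0;

        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(target)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n); // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ChunkedCopy <source> <target>");
            return;
        }
        long copied = copy(new File(args[0]), new File(args[1]));
        System.out.println("Copied " + copied + " bytes");
    }
}
```

Working on raw bytes avoids character decoding entirely; if the data has to be processed as text, the same loop works with a Reader, a Writer, and a char[] chunk.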
 