I've just started developing an image viewer. To keep the UI responsive while the program builds the thumbnail view of the images, I started out with a SwingWorker. It works fine but could be faster, so I decided to try an ExecutorService. I don't understand the difference in memory usage between the two implementations I've tried in the example code.
If you set useSwingWorker to true and run the program, the memory usage is reasonable and comparable with similar programs. Memory usage increases in small chunks, growing and shrinking a little while the thumbnails are created.
If you set useSwingWorker to false, the program uses a lot more memory. Memory usage increases rapidly and in big chunks. If the thread count passed to newFixedThreadPool is 1, the memory usage is about the same as in the useSwingWorker=true case.
As an example, for a directory with about 2500 images, the program uses about 400MB with useSwingWorker=true and about 1.5GB with useSwingWorker=false. The program uses imgscalr to create the thumbnails.
I haven't gone into your code in detail, but since you mentioned that the executor service uses the same memory as the SwingWorker when the number of threads is 1, the reason the executor service takes more memory could simply be that you are trying to do more with it.
One thing that is characteristic of languages that use a GC is that, generally speaking, GC is lazy. It doesn't run all the time, since it would be very inefficient to scan the object hierarchy in the heap very frequently. GC runs only when it is absolutely required. That means that when you profile the heap of an application that is constantly doing some processing, you will see heap usage go up steadily until it reaches a peak, and then GC triggers and brings the memory usage back down. The rate at which the memory increases (and hence the frequency at which GC runs) depends on how much memory the app is using and releasing.
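To see that pattern for yourself, here is a minimal sketch that allocates short-lived buffers and prints the used heap every so often. The class name HeapSawtooth and the 4 MB buffer size are just assumptions for the illustration; the actual numbers depend on your JVM, heap settings and collector.

```java
class HeapSawtooth {
    /** Heap currently in use, as reported by the JVM (includes garbage not yet collected). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        long touched = 0;
        for (int i = 1; i <= 200; i++) {
            // Short-lived allocation: becomes garbage at the end of each iteration.
            byte[] garbage = new byte[4 * 1024 * 1024];
            garbage[0] = 1;
            touched += garbage[0];
            if (i % 20 == 0) {
                System.out.printf("after %3d allocations: %d MB used%n", i, usedHeapBytes() / mb);
            }
        }
        System.out.println("buffers touched: " + touched);
    }
}
```

You would expect the printed value to climb and then drop back whenever a collection happens, rather than grow in a straight line.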
So it is possible that if you run multiple threads, or use a lot of memory in one thread, your memory usage will go up faster and GC will run more frequently. This is OK up to a certain limit; it is expected behavior. To draw an analogy, the more people at your Thanksgiving lunch, the faster the sink will fill with dirty dishes. There's nothing wrong with the sink filling up with dirty dishes as long as people at the party are having fun.
It is also possible to drive your processing so hard that the CPU spends more time running GC than it does in your threads. Now it is a problem. It's like the host of the party spending time cleaning dishes rather than having fun with the guests. This is what you have to watch out for. You have set up your executor service for a CPU-bound process. However, looking at what it is trying to do, your app seems likely to be memory-bound.
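If the work really is memory-bound, one option is to cap the pool size by memory as well as by core count. This is only a sketch: the poolSize helper, the "half of the max heap" budget and the per-task estimate of 48 MB (roughly one decoded 12-megapixel image at 4 bytes per pixel) are all assumptions you would need to replace with figures measured for your own images.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizing {
    /** Hypothetical helper: threads limited by cores and by how many tasks fit in the heap budget. */
    static int poolSize(int cores, long heapBudgetBytes, long perTaskBytes) {
        long byMemory = Math.max(1L, heapBudgetBytes / perTaskBytes);
        return (int) Math.min((long) cores, byMemory);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        long heapBudget = Runtime.getRuntime().maxMemory() / 2; // assumption: use half the max heap
        long perTask = 48L * 1024 * 1024;                       // assumption: memory per image task

        int threads = poolSize(cores, heapBudget, perTask);
        System.out.println("cores=" + cores + ", threads=" + threads);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // ... submit the thumbnail tasks here ...
        pool.shutdown();
    }
}
```

With a small heap this gives fewer threads than cores, which trades some speed for a lower peak memory usage.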
What you need to do is monitor the GC. If the program is spending a lot of time in GC, either increase the memory or reduce the number of threads.
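Besides external tools, you can get a rough figure from inside the program through the standard java.lang.management beans. The sketch below sums the collection time reported by every collector before and after a piece of work. The GcMonitor class and the allocation loop standing in for the thumbnail work are assumptions for the example, and the reported time is only approximate (for concurrent collectors it is not purely pause time).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcMonitor {
    /** Accumulated collection time in ms over all collectors (undefined values are skipped). */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Share of the elapsed time that was reported as GC time between two samples. */
    static double gcFraction(long gcBefore, long gcAfter, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (double) (gcAfter - gcBefore) / elapsedMillis;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcTimeMillis();
        long start = System.nanoTime();

        // Placeholder workload: short-lived buffers standing in for thumbnail creation.
        long checksum = 0;
        for (int i = 0; i < 500; i++) {
            byte[] buffer = new byte[2 * 1024 * 1024];
            buffer[i] = 1;
            checksum += buffer[i];
        }

        long elapsed = (System.nanoTime() - start) / 1_000_000;
        long gcAfter = totalGcTimeMillis();
        System.out.printf("GC: %d ms of %d ms (%.1f%%), checksum %d%n",
                gcAfter - gcBefore, elapsed, 100.0 * gcFraction(gcBefore, gcAfter, elapsed), checksum);
    }
}
```

If the percentage stays high while the thumbnails are being built, that is the sign to give the JVM more heap or to use fewer threads.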
I think you are right: the more you want, the more you pay. I tried splitting the input file list into 6 queues (one per core), and the memory usage was the same as in the ExecutorService test. Simply changing the garbage collector algorithm (-XX:+UseConcMarkSweepGC -XX:+UseParNewGC) reduces the memory usage a lot: it ends up only about 100MB higher than in the single SwingWorker test.
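When experimenting with collector flags like these, it can help to confirm which collectors the JVM actually ended up using, since recent JDK releases have dropped the CMS collector. A minimal sketch, where the ActiveCollectors class name is just an assumption for the example and the names printed vary by JVM version and flags:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Names of the garbage collectors registered in the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with the same -XX flags as the image viewer to compare.
        for (String name : collectorNames()) {
            System.out.println("collector: " + name);
        }
    }
}
```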
Either way, I would encourage you to use JConsole to monitor the heap. It tells you exactly how long the JVM spent in GC. Concurrent GC is a good choice for a highly concurrent process. Even if concurrent GC solves your problem, looking at the heap usage will verify what you might already know.