File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
The moose likes Java in General and the fly likes Bitsets and memory consumption Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Java in General
Bookmark "Bitsets and memory consumption" Watch "Bitsets and memory consumption" New topic

Bitsets and memory consumption

Varun Chopra
Ranch Hand

Joined: Jul 10, 2008
Posts: 211
I want to understand memory consumption at Runtime by jvm to allocate objects. Look at the code below.
It creates 20000 bitset objects, each of size 5000 bits. If I serialize these objects on disk, it creates a file of size 12.4mb (as expected). But at runtime in memory, it takes different amount of space.

Code has breaks in between, so as to check how much RAM is consumed at that point.

1. First break is before any bitset object is created. Java process takes 13mb RAM at that point.
2. When first 20000 objects are created, RAM consumption is 37mb (a jump of 14mb!!).
3. After 20000 bitset objects are cloned (so there are 2 copies in RAM), consumption is 54 mb (jump of 11 mb)
4. After 20000 bitset objects are cloned for 2nd time (3 copies in RAM), consumption is 64 mb (jump of 10 mb)
5. After 20000 bitset objects are cloned for 3rd time (4 copies in RAM), consumption is 76 mb (jump of 12 mb)
6. After 20000 bitset objects are cloned for 4th time (5 copies in RAM), consumption is 92 mb (jump of 14 mb)

If I look at value of a bitset object at runtime in debugger, I see integer positions of set bits (like {1},{2}......).
questions are:

a) are bitsets maintained as bits in RAM or converted to integer positions (or is it just the toString method of BitSet!!)?
b) can one explain the memory consumption at runtime little bit?

-Varun -
(My Blog) - Online Certifications - Webner Solutions
you li

Joined: Jan 17, 2005
Posts: 3
a) yes the bitsets are maintained as bits. actually its maintained as an array of long. and every long in the array can hold a number of bits. so basically it is loaded into ram the same way the vm loads a long. the toString prints the position by walking trough all positions and see if it is set or not. if it is it prints the position.

b) i suspect the actual increase is roughly 12mb. but a lot of extra factors make it not as consistent. thinking of garbage collection, overhead in the cloning and other things. i am sorry i cannot be more specifick this is purely a guess based on my limited knowledge of the memory management of java
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3028
Well, I don't know how you're measuring the memory being used here, but it's probably not very accurate, based on your results. From my own observations, a large BitSet takes one byte for every 8 bits, plus a little extra overhead. Much as you'd expect - looking at the source of BitSet, the bits are stored as parts of long values ina a long[] array, with 64 bits in each long. Some implementations may have a very different result.

Here's the code I used to verify it uses 1 byte for every 8 bits (plus minor overhead):

As you can see, it may take a little time for the results to stabilize, but then they're pretty solid. I guess that's because of other stuff going on inside the JVM, which we can't see. But if you keep adding large BitSets, eventually those other effects go away, and you just see the results of the BitSet.
Varun Chopra
Ranch Hand

Joined: Jul 10, 2008
Posts: 211
Thanks You and Mike for answering.
Mike, I am using top command on Linux and monitoring java process's memory consumption.
I tried your program and yes it does take consistent storage, but if you run the loop for high number of iterations (1000 times for e.g.), you will see that some bits take more space. Here's the output of some bits when I ran your program for 600 loop iterations :

Usual change was +1000056, you can see it keeps on going up for some bits, hence I guess overall overhead is more in total.

Bitset 10 used 11132456, change of +1000064
Bitset 11 used 12135664, change of +1003208
Bitset 100 used 101141056, change of +1000464
Bitset 151 used 152144520, change of +1000664
Bitset 227 used 228149688, change of +1000968
Bitset 341 used 342157440, change of +1001424
Bitset 512 used 513169072, change of +1002112

Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3028
Most of those would be the ArrayList resizing itself. Note that I pre-sized it to 100 (thinking that would be more than enough to avoid this effect). In JDK 6 at least, the formula it uses when it needs to grow is

which explains why at 100 it jumps, then again at 151, then at 227, then 341, then 512.

Offhand I don't know why there would also be a jump at 11 as well, but I'm not too concerned about it. There are probably other things going on in the JVM that we don't know details of.
Varun Chopra
Ranch Hand

Joined: Jul 10, 2008
Posts: 211
Good to know that. Thanks Mike.
I agree. Here's the link:
subject: Bitsets and memory consumption
It's not a secret anymore!