Hashtable in memory versus database table in memory

mayank gupta
Ranch Hand

Joined: Dec 21, 2008
Posts: 78
I want to create a table in a database with a key and a value. The table will have around 100 million entries, but each row is just a 10-digit key and the 7-digit value corresponding to it. I want to load the database into memory.

Instead of using a database table, I have been asked to build a hashtable from the table in the DB on server startup and keep that hashtable in memory.
I wanted to just use MySQL for this, but I was told that no matter how much I optimize it, each query will still have to go through the query processor.
Because my query is just a lookup and nothing more complex, creating a hashtable in Java and loading it into memory would avoid that overhead.


Do you think an in-memory hashtable is a better idea than a database table in MySQL?
Also, how can I make sure that the hashtable stays cached at all times?
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30356

You could use a property file on disk. Then you can use Properties to read it in. Since Properties is a hashtable, you have an optimized approach to read it in.
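A minimal sketch of the Properties approach, assuming the data has been exported as one `key=value` line per entry. The class name `PropertiesLookup` and the sample keys are made up for illustration; a real run would pass a `FileReader` over the exported file instead of the `StringReader` used here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class, not part of any library.
class PropertiesLookup {

    // Reads key=value lines from the given source into a Properties object.
    static Properties load(Reader source) {
        Properties table = new Properties();
        try {
            table.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return table;
    }

    public static void main(String[] args) {
        // Sample data standing in for the exported file (assumed format).
        String sample = "9876543210=1234567\n9123456780=7654321\n";
        Properties table = load(new StringReader(sample));
        System.out.println(table.getProperty("9876543210"));
    }
}
```

Properties stores both key and value as String objects, so each entry costs far more memory than the 17 characters it represents. That matters at 100 million entries.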

You've confirmed you have enough memory to store this giant hashtable, right? If it has to page to disk, you might as well not be caching.


Peter Johnson
author
Bartender

Joined: May 14, 2008
Posts: 5823

Why not look into some of the newer key/value databases that have been appearing lately? They are especially adept at handling large volumes of data.
Some possibilities include HBase (from Hadoop), Cassandra and CouchDB. See http://nosql-database.org/ for lots of options.

mayank gupta
Ranch Hand

Joined: Dec 21, 2008
Posts: 78
Do you actually think having a hashtable is better than having a DB?
Each row of the hashtable holds 17 digits, so at one byte per digit that is 17 × 100 million ≈ 1.7 GB of RAM.
My machine has more RAM than that.

Why do you say that "If it has to page to disk, you might as well not be caching"?

I want to minimize disk I/O; that is why I want the hashtable in cache. When an update happens (which is very infrequent), I update the table on disk and then cache it.

Also, how do I cache the Properties/Hashtable? By running the program once? Is there any way to make sure that the table remains cached?

I will also look at the nosql databases.
Ireneusz Kordal
Ranch Hand

Joined: Jun 21, 2008
Posts: 423
mayank gupta wrote:
Each row of the hashtable holds 17 digits, so at one byte per digit that is 17 × 100 million ≈ 1.7 GB of RAM.
My machine has more RAM than that.

Read this article first: http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html?page=1
Then download the utility (see the resources on the last page) and do a simple test.

On my machine (64-bit JVM), a hashtable with 1000 entries (10-character keys, 7-character values) consumed 144196 bytes of memory, an average of about 144 bytes per entry. 100 million × 144 bytes gives approximately 14.4 GB of RAM.
On a 32-bit system, memory consumption per entry should be a little smaller, but you will hit the 2 GB memory limit per JVM.
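The test described above can be roughly reproduced without an external utility. The sketch below is not the sizeof tool from the article; it swaps in a cruder technique, comparing used heap before and after filling a Hashtable. `System.gc()` is only a hint to the JVM, so the number is an approximation and will differ between JVM versions and settings. The class and method names are made up for illustration.

```java
import java.util.Hashtable;

// Hypothetical measurement helper; results are approximate by design.
class HashtableFootprint {

    // Used heap after asking (not forcing) the JVM to collect garbage.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Fills a Hashtable with 10-character keys and 7-character values
    // and returns the approximate heap growth per entry, in bytes.
    static long bytesPerEntry(int entries) {
        long before = usedHeap();
        Hashtable<String, String> table = new Hashtable<>();
        for (int i = 0; i < entries; i++) {
            table.put(String.format("%010d", i), String.format("%07d", i));
        }
        long after = usedHeap();
        // Use the table after measuring so it is still reachable at that point.
        if (table.size() != entries) {
            throw new IllegalStateException("unexpected table size");
        }
        return (after - before) / entries;
    }

    public static void main(String[] args) {
        long perEntry = bytesPerEntry(100_000);
        System.out.println("approx. bytes per entry: " + perEntry);
        System.out.println("approx. GB for 100 million entries: "
                + perEntry * 100_000_000L / 1e9);
    }
}
```

Whatever the exact figure on a given JVM, the point is that object headers, String wrappers, and hashtable entry objects make each row far larger than 17 bytes.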
mayank gupta
Ranch Hand

Joined: Dec 21, 2008
Posts: 78
Hi,

Thank you for the input. I am running my program on a machine with 8 cores and 12 GB of RAM. When I increase the number of entries in the hashtable to more than 1 million, I get a java.lang.OutOfMemoryError. I am using a Hashtable<Integer, Integer>, and the size of each entry is 58 bytes. With 5 million entries, the size of the hashtable will be 290 MB. I increased the heap size with -Xms300m -Xmx500m, but I still get the OutOfMemoryError. What could the problem be, and how can I resolve it?
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

Increase the heap size. If you want to find out what is using the memory you will need to profile your application.

mayank gupta
Ranch Hand

Joined: Dec 21, 2008
Posts: 78
No matter how much I increase the heap size, I keep getting the error.
What is the significance of the minimum heap size? How much should that be?
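On the minimum heap size: -Xms sets the initial heap size and -Xmx sets the maximum, so only -Xmx determines when an OutOfMemoryError occurs. A quick way to confirm that the flags are actually reaching the JVM is to print what the running process sees. The sketch below does that and compares it with an estimate; the class name is made up, and the 58 bytes per entry is the figure quoted earlier in this thread, which may well be too low.

```java
// Hypothetical diagnostic helper for checking heap settings.
class HeapSettings {

    // Maximum heap the running JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Rough size estimate in megabytes for a given entry count and per-entry cost.
    static long estimatedMegabytes(long entries, long bytesPerEntry) {
        return entries * bytesPerEntry / (1024 * 1024);
    }

    public static void main(String[] args) {
        long needed = estimatedMegabytes(5_000_000L, 58L); // figure from the post above
        long available = maxHeapMegabytes();
        System.out.println("estimated MB needed: " + needed);
        System.out.println("max heap MB (-Xmx): " + available);
        if (needed >= available) {
            System.out.println("The table alone would not fit in the configured heap.");
        }
    }
}
```

If the printed maximum does not change when -Xmx is changed, the option is probably not being passed to the JVM that runs the program (for example, it is set on a launcher script or IDE configuration that is not in use).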
mayank gupta
Ranch Hand

Joined: Dec 21, 2008
Posts: 78
Can anybody please suggest something? I am not able to run my code with more than 3 million entries.
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

If your program needs more memory than the JVM can provide, you need to change your program to do whatever it needs to do differently. A 64-bit JVM on a machine with 12 GB of RAM gives you a pretty big chunk of available memory.

If you are not sure what is using memory or how it is using it, profile it to find out.
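One possible reading of "do it differently", offered as an assumption rather than something proposed in this thread: store the data in primitive arrays instead of boxed objects. A 10-digit key does not fit in an int but fits in a long (8 bytes), and a 7-digit value fits in an int (4 bytes), so two parallel arrays cost about 12 bytes per entry, or roughly 1.2 GB for 100 million entries. The class `CompactLookup` is hypothetical and expects the keys to be sorted already.

```java
import java.util.Arrays;

// Hypothetical read-only lookup table backed by primitive arrays.
class CompactLookup {
    private final long[] keys;  // sorted ascending, no duplicates (assumed)
    private final int[] values; // values[i] belongs to keys[i]

    CompactLookup(long[] sortedKeys, int[] valuesInKeyOrder) {
        if (sortedKeys.length != valuesInKeyOrder.length) {
            throw new IllegalArgumentException("keys and values must have equal length");
        }
        this.keys = sortedKeys;
        this.values = valuesInKeyOrder;
    }

    // Returns the value for the key, or -1 if the key is absent.
    int get(long key) {
        int index = Arrays.binarySearch(keys, key);
        return index >= 0 ? values[index] : -1;
    }

    public static void main(String[] args) {
        CompactLookup table = new CompactLookup(
                new long[] {1000000000L, 9876543210L},
                new int[] {1234567, 7654321});
        System.out.println(table.get(9876543210L));
    }
}
```

The trade-off is that a lookup becomes a binary search rather than a hash probe, and updates require rebuilding the arrays, which may be acceptable if updates are very infrequent.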
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30356

mayank gupta wrote:
Why do you say that "If it has to page to disk, you might as well not be caching"?

I want to minimize disk I/O; that is why I want the hashtable in cache. When an update happens (which is very infrequent), I update the table on disk and then cache it.

Right. But if the cache doesn't fit in memory, you are paging to disk, and then it doesn't minimize disk I/O.

mayank gupta wrote:
Also, how do I cache the Properties/Hashtable? By running the program once? Is there any way to make sure that the table remains cached?

You cache something by reading it into memory. Presumably you have code to do that if you have a cache. I don't know of a way to make sure that segment of memory remains cached in RAM.
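As a rough sketch of "reading it into memory": load the table once at startup, keep a reference to it in a long-lived object so the garbage collector never reclaims it, and reload it after the infrequent updates. The `LookupCache` class and its loader are hypothetical; in a real application the loader would run the query against the database. This keeps the data on the Java heap, but it does not stop the operating system from paging that memory out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical cache holder; the loader stands in for the real database read.
class LookupCache {
    private final Supplier<Map<Long, Integer>> loader;
    // volatile so that readers see a freshly reloaded table.
    private volatile Map<Long, Integer> table;

    LookupCache(Supplier<Map<Long, Integer>> loader) {
        this.loader = loader;
        this.table = loader.get(); // load once, e.g. at server startup
    }

    // Returns the cached value, or null if the key is not present.
    Integer get(long key) {
        return table.get(key);
    }

    // Call after the underlying table on disk has been updated.
    void reload() {
        table = loader.get();
    }

    public static void main(String[] args) {
        Map<Long, Integer> source = new HashMap<>();
        source.put(9876543210L, 1234567);
        LookupCache cache = new LookupCache(() -> new HashMap<>(source));
        System.out.println(cache.get(9876543210L));
    }
}
```

As long as the `LookupCache` instance itself stays reachable (for example, from a static field or an application-scoped object), the table stays in the heap for the life of the process.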
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

I don't think you can control the JVM's or OS's memory management without treachery.

You might be able to use JNI to hold on to a huge chunk of memory... but I guess I'm still not convinced this is really necessary. Without knowing the access pattern I wouldn't want to try and optimize.
 