Most efficient way to dump a large HashMap?

Db Riekhof
Greenhorn

Joined: May 21, 2009
Posts: 10
We're thinking of using a HashMap to get rid of duplicate records that have the same key value. So we shove everything into a large Map; after removing duplicates it will probably hold somewhere between 100,000 and 10 million records per run. The key will be around 20 characters and the value around 300 characters, both String objects (rough guesstimates).

So, after processing all the records and removing dupes, we'd like to write all the values out to a file. We don't need to sort. There seem to be a few ways to do this with HashMap:

1) Getting the Set of keys, iterating through the keys, and pulling each value with get().
2) Getting the Set of entries (entrySet()) and iterating through those.
3) Getting the Collection of values and iterating through it directly.

Does anyone have a good understanding of the consequences of each choice above, in terms of memory usage and performance? (The three options are sketched in code below.)
If all of these prove too slow or need too much memory, we'll have to look for another data structure or write a custom one.

Looking for some advice/guidance on this. Anyone have an opinion on the best way to handle this problem?
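
For concreteness, here is a minimal sketch of the three options; the map contents and the output file name are made up for illustration, and in practice you would use only one of the three loops:

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    public class DumpMap {
        public static void main(String[] args) throws IOException {
            Map<String, String> dedup = new HashMap<String, String>();
            dedup.put("key1", "value1");   // stand-ins for the real records
            dedup.put("key2", "value2");

            BufferedWriter out = new BufferedWriter(new FileWriter("out.txt"));
            try {
                // Option 1: iterate the key set, looking up each value
                // (costs an extra hash lookup per key)
                for (String key : dedup.keySet()) {
                    out.write(dedup.get(key));
                    out.newLine();
                }

                // Option 2: iterate the entry set (one pass, no extra lookups,
                // and the keys are available if you ever need them)
                for (Map.Entry<String, String> e : dedup.entrySet()) {
                    out.write(e.getValue());
                    out.newLine();
                }

                // Option 3: iterate the values view directly
                // (simplest when the keys aren't needed at all)
                for (String value : dedup.values()) {
                    out.write(value);
                    out.newLine();
                }
            } finally {
                out.close();
            }
        }
    }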
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12760
Based on the java.util.HashMap API documentation, I think the Collection view returned by values() is likely the cheapest of these to create.

If you want to dig into the implementation, look at the source code; here is the values() method, for example.
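
(Reproduced roughly from the JDK 6 java.util.HashMap source; values and Values are private members of HashMap:)

    // values() just hands back a cached view object, allocating a small
    // Values wrapper on first call; no elements are copied at all.
    public Collection<V> values() {
        Collection<V> vs = values;
        return (vs != null ? vs : (values = new Values()));
    }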
I can't imagine how it could be faster.

Bill
SumitPal Pal
Greenhorn

Joined: Aug 31, 2010
Posts: 21
My first question in trying to help solve this concerns your statement:

"We're thinking of using HashMap to get rid of duplicate records having the same key value. So we shove everything into a large Map."

Where are you shoving it from? Is it already all in memory in an array, or is it sitting in a database? Knowing that will help us guide you better.
 
 