aspose file tools*
The moose likes Performance and the fly likes Overuse Java HashMap Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "Overuse Java HashMap" Watch "Overuse Java HashMap" New topic
Author

Overuse Java HashMap

chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
I saw a lot of guys(young and elder developers) overuse Java HashMap.They used it to save a few key/value pairs.I am very sure the number of pairs is less than 10.
I researched the source codes of HashMap,it's good,but it should be used to contain a large number objects.I believe this is a very bad practice.It used too much memory when just saving several objects.
I measured the size of empty HashMap object,it's about 300 bytes.Wow,don't tell me 300 bytes is okay.If you are a server-side programmer,you will find memory is not enough when processing thousands of requests from Internet.Also,the hash algorithme used in get/put methods use more CPU time than array.

I believe it's a very bad practice,these guys didn't know what they were doing.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

Hi, chen shu, and welcome to the Ranch!

chen shu wrote: I saw a lot of guys(young and elder developers) overuse Java HashMap.They used it to save a few key/value pairs.I am very sure the number of pairs is less than 10.
I researched the source codes of HashMap,it's good,but it should be used to contain a large number objects.I believe this is a very bad practice.It used too much memory when just saving several objects.


No, it doesn't. The overhead of a few tens or hundreds ,of bytes is negligible.

I measured the size of empty HashMap object,it's about 300 bytes.Wow,don't tell me 300 bytes is okay.


It is totally okay, unless you have measured a real bottleneck in real production code, and then it's only not okay in that particular circumstance.

If you are a server-side programmer,you will find memory is not enough when processing thousands of requests from Internet


10,000 requests X 300 bytes per request is 3,000,000 bytes. This is nothing in a server machine that is made to handle 10,000 simultaneous requests.

Also,the hash algorithme used in get/put methods use more CPU time than array.


Again, the difference is negligible, and, unlike the array, the HashMap case scales up several orders of magnitude with no significant--or even noticeable, in most cases--decrease in performance.

I believe it's a very bad practice,these guys didn't know what they were doing.


You are engaging in what is known as premature optimization. You're picking things that you assume will make a difference in how your app performs, but there is an exceedingly high probability that your guess is wrong. Use proper algorithms and data structures that fit your design and make your code easy to understand and don't try to save a few bytes or cycles here or there because you think it will make it more "efficient."

chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Thanks for your response.
But sorry,I totally disagree with you.
I am a senior C++ and Java developer over ten years.
I worked together with Google Map China Director.He told me creating a Java object is expensive,don't do it if you can design a singleton object.
Saving memory and CPU time is very useful for Java service Apps.
Maybe you think you are right,but I think my solution (using array instead of hashmap to contain a few objects) is better than yours.
My question is why don't you choose the better one?It's not a very difficult way.It's very easy to implement as far as I know.
Thanks.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:He told me creating a Java object is expensive,don't do it if you can design a singleton object.


Very bad advice. Maybe 15 years ago creating objects in Java was "expensive". Not really the case now. And, in those rare cases when singleton is appropriate, it is for design reasons, NOT to save memory or CPU cycles.

Saving memory and CPU time is very useful for Java service Apps.


Saving it in the appropriate way in the appropriate places is useful. Worrying about a few hundred or a few thousand bytes because you think it will be more "efficient" when you haven't actually measured a significant bottleneck is very much not useful. There's a lot of good literature on this subject. Google for write dumb code or premature optimization to find some examples.

My question is why don't you choose the better one?


I do. I choose the one that appropriately reflects my design intent. So, if I have a handful of objects, and I want to access them by some key--say an "id" or "name" field--then I use a Map. This is the correct approach regardless of whether it's 5 objects or 5 million. The code is easier to write, easier to understand, less likely to be buggy, not noticeably wasteful, and, very importantly, scales well--it may be 5 objects today, but if it's 5,000 next month, my code will still work just as well, with no changes.
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60063
    
  65

I am completely in agreement with Jeff. Using inappropriate constructs to hold data because of a savings a few hundred bytes is a poster case for premature optimization.


[Asking smart questions] [Bear's FrontMan] [About Bear] [Books by Bear]
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Jeff Verdegan wrote:I do. I choose the one that appropriately reflects my design intent. So, if I have a handful of objects, and I want to access them by some key--say an "id" or "name" field--then I use a Map. This is the correct approach regardless of whether it's 5 objects or 5 million. The code is easier to write, easier to understand, less likely to be buggy, not noticeably wasteful, and, very importantly, scales well--it may be 5 objects today, but if it's 5,000 next month, my code will still work just as well, with no changes.


Actually,If you just want to find a object using key/value way,you can try TreeMap,Properties.Java provides these classes already.HashMap is not the only choice.

I list all subclasses from JDK here:
All Known Implementing Classes:
AbstractMap, Attributes, AuthProvider, ConcurrentHashMap, ConcurrentSkipListMap, EnumMap, HashMap, Hashtable, IdentityHashMap, LinkedHashMap, PrinterStateReasons, Properties, Provider, RenderingHints, SimpleBindings, TabularDataSupport, TreeMap, UIDefaults, WeakHashMap

I mean we have a lot of choice,why you like HashMap so much?


chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Bear Bibeault wrote:I am completely in agreement with Jeff. Using inappropriate constructs to hold data because of a savings a few hundred bytes is a poster case for premature optimization.


Sorry,using Array means premature optimization?
Do you mean we should use HashMap every where? Do you mean we can't use Array from now on?

Maybe you could make a rule,don't use any data structure,we just need HashMap.


Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

[code=java]
chen shu wrote:
Bear Bibeault wrote:I am completely in agreement with Jeff. Using inappropriate constructs to hold data because of a savings a few hundred bytes is a poster case for premature optimization.


Sorry,using Array means premature optimization?


If you're using it because you're trying to avoid wasting a few hundred bytes in a map, even though you're going to be looking something up by a key, then, yes, it definitely is.

Do you mean we should use HashMap every where? Do you mean we can't use Array from now on?


Nobody said anything even close to that. Please pay attention to what is actually being said.
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60063
    
  65

Jeff Verdegan wrote:If you're using it because you're trying to avoid wasting a few hundred bytes in a map, even though you're going to be looking something up by a key, then, yes, it definitely is.

Exactly this.

Nobody said anything even close to that. Please pay attention to what is actually being said.

And this.

Are you here to have a meaningful discussion or just be argumentative?
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:
Jeff Verdegan wrote:I do. I choose the one that appropriately reflects my design intent. So, if I have a handful of objects, and I want to access them by some key--say an "id" or "name" field--then I use a Map. This is the correct approach regardless of whether it's 5 objects or 5 million. The code is easier to write, easier to understand, less likely to be buggy, not noticeably wasteful, and, very importantly, scales well--it may be 5 objects today, but if it's 5,000 next month, my code will still work just as well, with no changes.


Actually,If you just want to find a object using key/value way,you can try TreeMap,Properties.Java provides these classes already.HashMap is not the only choice.

I list all subclasses from JDK here:
All Known Implementing Classes:
AbstractMap, Attributes, AuthProvider, ConcurrentHashMap, ConcurrentSkipListMap, EnumMap, HashMap, Hashtable, IdentityHashMap, LinkedHashMap, PrinterStateReasons, Properties, Provider, RenderingHints, SimpleBindings, TabularDataSupport, TreeMap, UIDefaults, WeakHashMap

I mean we have a lot of choice,why you like HashMap so much?




HashMap is a reasonable default Map implementation. In many cases, any one would work just as well as any other, but we have to pick something, so, in that absence of any other criteria, I'll pick the one that's lightweight, and that offers O(1) storage and retrieval performance.

Additionally, since, for whatever reasons, HashMap *is* the de facto default map implementation, if we use a different one, then somebody reading out code later (including us) may think we had a specific reason to choose that one. Programming is hard enough already without adding more confusion just for the fun of it.

If I need to maintain my keys in sorted order, I'll go with TreeMap. If I need the specific features that one of the other maps offers, I'll use that one. In other words, like I said before, I use the one that's appropriate for my design.
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
I am trying to figure out the best practice in Java world?
If you want to save one or two key/value pairs in map?
Which one do you like? TreeMap,Properties or HashMap?
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Using too much memory is a big problem.You can have a look at the benchmark:
http://cppcms.com/wikipp/en/page/benchmarks_all

I like Java very much.But I care the cost of memory and cost when writing my Apps.
Am I right?
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:I am trying to figure out the best practice in Java world?
If you want to save one or two key/value pairs in map?
Which one do you like? TreeMap,Properties or HashMap?


Like I said: If I want them to be sorted, TreeMap. Otherwise, for general purpose key/value mapping, HashMap.
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Just saving one or two objects in map.
Actually,you didn't answer my question.
My question is not about general situation,nobody know what is the definition of your general situation.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:Using too much memory is a big problem.


In the time I've spent participating in this thread, I could have made enough money to buy 8GB of RAM. It's not that I'm astoundingly well-paid. It's that memory is cheap compared to the wages paid to even a mediocre developer.

I like Java very much.But I care the cost of memory and cost when writing my Apps.
Am I right?


You are right that you should pay attention to memory use. However, you have to be smart about what is significant and what is not. Choosing an array over a map to save a few hundred bytes when you want to look something up by a key is like not turning on your windshield wipers in the rain because it will use extra gas.

Now, if you're operating in a limited memory environment (as opposed to a desktop app, web app or server-side app), then the scale of hundreds or thousands of bytes may be significant. Even phones have several GB of memory now though. So, know your target environment, and focus your effort accordingly.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:Just saving one or two objects in map.
Actually,you didn't answer my question.


Yes, this exactly answers your question: If I want them to be sorted, TreeMap. Otherwise, for general purpose key/value mapping, HashMap.

My question is not about general situation,nobody know what is the definition of your general situation.


By "general case" I mean without any specific requirements or use case to push toward any other map implementation. What are you not understanding? Have you looked at each of those other implementations? Do you understand what specializations each one offers? If all I want is to look something up by a key, and I don't have any requirements for sorting or reading from a key=value file or anything like that, then, as I already stated at least twice, HashMap is a reasonable default, and the one I would use.
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Yes, this exactly answers your question: If I want them to be sorted, TreeMap. Otherwise, for general purpose key/value mapping, HashMap.


I don't think you are right. I can implement a key/value structure using array if only contains one or two elements.
I saw a lot of people use HashMap to contain one or two objects.
Do you really think it is the best solution,even better than array?


Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:
Yes, this exactly answers your question: If I want them to be sorted, TreeMap. Otherwise, for general purpose key/value mapping, HashMap.


I don't think you are right. I can implement a key/value structure using array if only contains one or two elements.


Yes, you can. But there's no good reason to, unless you find yourself in a situation where a few hundred bytes of overhead is causing a real problem.


I saw a lot of people use HashMap to contain one or two objects.
Do you really think it is the best solution,even better than array?


I thought I'd made this quite clear already, several times over: If you are going to look up the items by key (as opposed to just iterating or selecting by index), then, yes, a map is the correct solution.

You appear to have made up your mind already, and you don't seem to be paying attention to what I'm saying, so I'm going to leave this discussion now. Good luck.
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
Wow!
Map is just one interface.You could implement it using your favorite structure.
I can implement it using Array.
JDK implements it as tree,hash table,list(properties) or other data structures.
Good luck to you!


Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 2969
    
    9
Wow is right.

But probably not for the reasons you have.
James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 973
    
    5

I saw a lot of people use HashMap to contain one or two objects. Do you really think it is the best solution,even better than array?


Yes but how were they accessing the data to retrieve it? By a key? If you use an array, then finding the element you want may not be noticeable (performance-wise) for a few objects but, like what has already been said, what happens when your collection suddenly holds hundreds or thousands of objects?
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 10918
    
  12

chen shu wrote:I don't think you are right. I can implement a key/value structure using array if only contains one or two elements.
I saw a lot of people use HashMap to contain one or two objects.

What happens if in the future, you need it to contain 30 objects? or 500? or 30,000?

At some point, you're going to wish you has used the HashMap, because you are going to have to re-factor your code. If you has used the HashMap to begin with, it wouldn't matter.

It is virtually impossible to say "We will NEVER EVER EVER need more than one or two objects."

For that matter, if you have one object, why do you even need an array? Do you really declare an array of size 1?


There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18123
    
  39

chen shu wrote:
Yes, this exactly answers your question: If I want them to be sorted, TreeMap. Otherwise, for general purpose key/value mapping, HashMap.


I don't think you are right. I can implement a key/value structure using array if only contains one or two elements.
I saw a lot of people use HashMap to contain one or two objects.
Do you really think it is the best solution,even better than array?


To be fair, I don't think either side is completely technically wrong here -- as it is all relative.


I agree with Jeff and Bear because I am closer (relatively in situation) to them. To me, a product is pushing deadlines, and has to meet implementation targets. In that regard, the most expensive resource on the project is the developer. Yes, optimizing for cpu speed, memory usage, disk usage, network usage, are all important, but what is more important? The most important is the most expensive, which is developer time, hence, project implementation first, optimization later.

From my POV, if the programmer isn't as important than memory optimization, because the project is overstaffed, then by all means do the optimization. If the programmer isn't as important as memory optimization, because memory resources is in serious trouble, then again, do the optimization; actually, you can even argue that you shoudn't use object orient programming (in this case)..... but if I am trying to make deadlines, I think it is a waste to try to figure out which map is best (just take a best effort guess), or to try to rework the algorithm to make it work with an array.

Henry

Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
I think it is a waste to try to figure out which map is best

So I suggest that you could use Hashtable instead of HashMap.

What happens if in the future, you need it to contain 30 objects? or 500? or 30,000?

Also you could use vector instead of ArrayList.Because we could use it in multi-thread environment in the future.

For that matter, if you have one object, why do you even need an array? Do you really declare an array of size 1?

Good question.If I was that developer,I would not use array,hashmap or any other container,maybe just a object.

The most important is the most expensive, which is developer time, hence, project implementation first, optimization later.

Optimization later?Do you really never design a good algorithm when programming?By the way,this situation is very easy,we just need select a good container to do something.We just need to design a class as following:
public class Types{

private TreeMap<String,String> values;

public String findType(final String key){
....
}

}

In the feature,if we find the values must contain a large number of pairs,just replace TreeMap with HashMap.Very difficult?
In the feature,if we are in multi-thread environment,change its type to HashTable,it is very easy!
But I don't want to talk about the feature here,just let you know I find a lot of guys really use HashMap to hold one or two objects? And I am very sure that there is no chance to hold more objects.



Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18123
    
  39

chen shu wrote:
The most important is the most expensive, which is developer time, hence, project implementation first, optimization later.

Optimization later?Do you really never design a good algorithm when programming?


Of course I don't mean that !! There is a world of difference between refactoring and optimization. Optimization requires refactoring, but it also requires testing, measurements, etc. Holding off optimization in favor of meeting a deadline, doesn't mean that you pick the worst option, nor does it mean that you don't change it.... but you probably already know that.

Henry
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:
I think it is a waste to try to figure out which map is best

So I suggest that you could use Hashtable instead of HashMap.


Why?


What happens if in the future, you need it to contain 30 objects? or 500? or 30,000?

Also you could use vector instead of ArrayList.Because we could use it in multi-thread environment in the future.


Irrelevant. If we need to worry about concurrency, we would use a Collections.synchronizedMap() or a Map from java.util.concurrent (if our design calls for looking up something by a key) or Collections.synchronizedList() or a List from java.util.concurrent if our design calls for a List.

There's no good reason to use Hashtable or Vector unless you have no choice because you're working with legacy code that requires it.

The most important is the most expensive, which is developer time, hence, project implementation first, optimization later.

Optimization later?Do you really never design a good algorithm when programming?


Again: Be smart and pick the optimization that will actually matter. Saving a few hundred bytes for an O(N) lookup over an O(1) lookup is a horrible approach to optimization unless you have an unusual use case with a specific reason to do so

By the way,this situation is very easy,we just need select a good container to do something.We just need to design a class as following:
public class Types{

private TreeMap<String,String> values;

public String findType(final String key){
....
}

}

In the feature,if we find the values must contain a large number of pairs,just replace TreeMap with HashMap.Very difficult?
In the feature,if we are in multi-thread environment,change its type to HashTable,it is very easy!


Easy, but incorrect. We wouldn't choose HashMap over TreeMap just because the number of entries got large, and we wouldn't use Hashtable just because we had a multithreaded environment.

But I don't want to talk about the feature here,just let you know I find a lot of guys really use HashMap to hold one or two objects? And I am very sure that there is no chance to hold more objects.


Creating a map for something that is only intended to ever hold one entry is pointless. So is creating an array for that case. This of course does not mean that we should use an array instead of a Map for keyed lookups when there will only be a small number of entries.

James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 973
    
    5

In the feature,if we find the values must contain a large number of pairs,just replace TreeMap with HashMap.Very difficult? In the feature,if we are in multi-thread environment,change its type to HashTable,it is very easy!


I dont think you are teaching anything new to the majority of people following this topic. And you now seem to be going off on a tangent preaching software design.

I think we can summarise this topic by concluding that the Collections API is extensive and a developer should choose the appropriate type carefully to suit his/her needs. However, just because the OP has seen examples of overusing HashMap, does not mean the entire Java community on this forum do likewise.
chen shu
Greenhorn

Joined: May 07, 2012
Posts: 13
James Boswell wrote:
In the feature,if we find the values must contain a large number of pairs,just replace TreeMap with HashMap.Very difficult? In the feature,if we are in multi-thread environment,change its type to HashTable,it is very easy!


I dont think you are teaching anything new to the majority of people following this topic. And you now seem to be going off on a tangent preaching software design.

I think we can summarise this topic by concluding that the Collections API is extensive and a developer should choose the appropriate type carefully to suit his/her needs. However, just because the OP has seen examples of overusing HashMap, does not mean the entire Java community on this forum do likewise.


You are right.That's what I am thinking. A good developer can select a right container from candidates quickly.I am not a teacher,just tell you what I watched in the past years in Java world.

Actually,in C/C++ word,my colleagues(including me) always make this kind of decision very carefully. Also,cloud server-side and android Java programmers always realize this is very important.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

chen shu wrote:
Actually,in C/C++ word,my colleagues(including me) always make this kind of decision very carefully.


Oh, so then you don't choose an array over a Map when looking up by key just to save a few hundred bytes.

Glad to hear it.

Also,cloud server-side and android Java programmers always realize this is very important.


Yes, we do realize that it's important to choose the appropriate data structure for the right reasons. Which is why we don't choose an array over a Map when looking up by key just to save a few hundred bytes.
Jayesh A Lalwani
Bartender

Joined: Jan 17, 2008
Posts: 2058
    
  22

In C++ loading a value from an array can be significantly faster than loading a value from a Map simply because arrays are inlined in C Really, if you are all gung ho about prematurely optimizing your code, you should implement an inline array for your object and then build a Map over it .
James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 973
    
    5

Again, I think this topic is deviating from the original point which is the overuse of HashMap in Java - no mention of C/C++. It feels a little as though the OP originally wanted to have an argument/rant with someone but has ultimately agreed with the points being made all along.
 
It is sorta covered in the JavaRanch Style Guide.
 
subject: Overuse Java HashMap
 
Similar Threads
Available Ram...
Weak keys in WeakHashMap
Locking dilema
Vector Vs. Hashtable
Data Caching Best Practices ?