Basic question about java.util.HashMap

Julien Martin
Ranch Hand

Joined: Apr 24, 2004
Posts: 384
Hello,

Say I want to store the number of copies of books as the value of a HashMap and use the Book object as the key. Let's suppose my Book objects are very large.

My question is as follows: will my HashMap store the actual Book object, or will it only use the result of hashCode() (called on my Book objects) when I "put" a new book together with its number of copies into the HashMap?

Here is how I "put" the book into my Map:



Thanks in advance for your help,

J.
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14269
    

It will keep a reference to the actual Book object; it will not store only the value returned by hashCode(). (That would not work, because hash codes are not necessarily unique.) So if your Book objects are very large and you make a HashMap with Book objects as the keys, all of those Book objects will remain in memory for as long as they are in the map.

Do you really need to use those large Book objects as keys? Do your Book objects contain the whole contents of a book or something like that? Maybe you want to make a BookInfo object that contains some identification fields, but not the whole contents of the book, and use that for the keys in the map.
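
A minimal sketch of the point above, for illustration only. LargeBook and KeyReferenceDemo are made-up names (the Book class from the question is not shown in this thread). The sketch checks that the key held by the map is the very same object that was put in, which is why the large object stays reachable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the "very large" Book from the question.
class LargeBook {
    final String title;
    final byte[] contents; // pretend this is the whole text of the book

    LargeBook(String title, int sizeInBytes) {
        this.title = title;
        this.contents = new byte[sizeInBytes];
    }
}

class KeyReferenceDemo {
    // Returns true if the key stored in the map is the same instance we put in.
    static boolean mapHoldsSameInstance() {
        Map<LargeBook, Integer> copies = new HashMap<>();
        LargeBook book = new LargeBook("Some Title", 1_000_000);
        copies.put(book, 3);

        // The map's key set hands back a reference to the original object,
        // not a copy and not just a hash code.
        LargeBook storedKey = copies.keySet().iterator().next();
        return storedKey == book;
    }

    public static void main(String[] args) {
        System.out.println(mapHoldsSameInstance()); // prints "true"
    }
}
```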


Lolu Peace
Greenhorn

Joined: Dec 20, 2010
Posts: 12
Before you put anything in the HashMap, the key and value objects are instantiated. So your Book object is instantiated and its hash code is computed. The HashMap then uses the hash code to decide which of its "hash buckets" the entry goes into. When the HashMap later has to retrieve the value associated with that key, it searches that bucket for your Book object (using equals()) and returns the value stored with it.
It is also useful to have a way to persist your HashMap, e.g. by writing it to a file, so that all the objects can be read back when you need them, instead of rebuilding the HashMap every time your application loads.

I hope I made sense here.
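
To make the bucket idea concrete, here is a small sketch using String keys instead of Book objects (an assumption made to keep it short; BucketCollisionDemo is a made-up name). The strings "Aa" and "BB" happen to have the same hash code, yet the map keeps them as two separate entries because it compares the keys themselves with equals().

```java
import java.util.HashMap;
import java.util.Map;

class BucketCollisionDemo {
    // Builds a map with two keys whose hash codes are identical.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> copies = new HashMap<>();
        copies.put("Aa", 1);
        copies.put("BB", 2); // same hash code as "Aa", different key
        return copies;
    }

    public static void main(String[] args) {
        Map<String, Integer> copies = buildMap();
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints "true"
        System.out.println(copies.size());                      // prints "2"
        System.out.println(copies.get("Aa"));                   // prints "1"
        System.out.println(copies.get("BB"));                   // prints "2"
    }
}
```

If the map stored only hash codes, the second put would have overwritten the first.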
Aditya Jha
Ranch Hand

Joined: Aug 25, 2003
Posts: 227

@Jesper Technically, does it really make a difference, assuming all of the Book objects have to be kept in memory anyway? What you described does hold true for databases from the point of view of normalization (or reducing data redundancy). However, in terms of objects, having a heavy instance as a key in a map shouldn't make operations on that map any slower, I think. Introducing a BookInfo will only add resource usage.

Though I agree it is better OOD, and BookInfo might help in some other use cases. Purely from this problem's perspective, however, it would be overkill, IMHO.
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659
    

Julien Martin wrote: Let's suppose my Book objects are very large.


Then your code is very bad.
If it's really a Book, then it has an ISBN (International Standard Book Number). You should use the ISBN as your key.



Now you can manage the Book objects in some separate store, and not break your quantity map.
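
A minimal sketch of what an ISBN-keyed quantity map might look like. This is not the poster's original code; QuantityByIsbn and its method names are made up for illustration, and the ISBN strings are placeholders rather than real ISBNs.

```java
import java.util.HashMap;
import java.util.Map;

class QuantityByIsbn {
    // Number of copies on hand, keyed by the ISBN string instead of a Book.
    private final Map<String, Integer> quantities = new HashMap<>();

    // Adds n copies for the given ISBN, creating the entry if needed.
    void addCopies(String isbn, int n) {
        quantities.merge(isbn, n, Integer::sum);
    }

    // Returns the number of copies, or 0 if the ISBN is unknown.
    int copiesOf(String isbn) {
        return quantities.getOrDefault(isbn, 0);
    }

    public static void main(String[] args) {
        QuantityByIsbn stock = new QuantityByIsbn();
        stock.addCopies("ISBN-PLACEHOLDER-1", 2);
        stock.addCopies("ISBN-PLACEHOLDER-1", 3);
        System.out.println(stock.copiesOf("ISBN-PLACEHOLDER-1")); // prints "5"
        System.out.println(stock.copiesOf("ISBN-PLACEHOLDER-2")); // prints "0"
    }
}
```

The map never touches a Book object, so the books themselves can live wherever is convenient.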
Aditya Jha
Ranch Hand

Joined: Aug 25, 2003
Posts: 227

@Pat Could you please describe how the ISBN-as-key approach is better than Book-as-key in concrete terms? If the Book class has an immutable property called 'isbn', and uses this property in properly implemented 'equals' and 'hashCode' methods, why can't a Book instance be a key in a Map, however large it may be?

In fact, having a Book instance as the key lets us hide the exact property on which equality is decided. It could be the ISBN, or it could be as simple as the book name, in case we're talking about books for the syllabus of a particular course. The point is, why should an external entity need to know how two instances of Book are decided to be equal?

As I said before, having ISBN (or other such 'key'/BookInfo) has its own advantages in certain (or most) situations. But, talking from strictly this problem's perspective, I don't see a value add in having a 'simpler' or 'lighter' key for the map.
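
A sketch of the Book-as-key variant described in this post, assuming a Book with an immutable isbn field. The field names and BookKeyDemo are made up for illustration. Equality and the hash code depend only on the ISBN, so a second instance with the same ISBN finds the entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class Book {
    private final String isbn;     // immutable identity of the book
    private final String fullText; // the "large" part; ignored by equals/hashCode

    Book(String isbn, String fullText) {
        this.isbn = Objects.requireNonNull(isbn);
        this.fullText = fullText;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Book)) return false;
        return isbn.equals(((Book) other).isbn);
    }

    @Override
    public int hashCode() {
        return isbn.hashCode();
    }
}

class BookKeyDemo {
    // Looks up the count using a different Book instance with the same ISBN.
    static int lookupWithEqualInstance() {
        Map<Book, Integer> copies = new HashMap<>();
        copies.put(new Book("ISBN-PLACEHOLDER-1", "...imagine lots of text..."), 4);
        Book probe = new Book("ISBN-PLACEHOLDER-1", "");
        return copies.get(probe);
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualInstance()); // prints "4"
    }
}
```

This works as a key, but the map still holds on to the first Book instance, full text included.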
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659
    

Aditya Jha wrote: @Pat Could you please describe how the ISBN-as-key approach is better than Book-as-key in concrete terms?

As others have written upthread, using the Book instance as the key requires that you have that instance in memory. You say that the Book is "big". There are between 700,000 and a million books in print. If you had to have a million "big" things in memory, you couldn't run it on any practical machine.

Suppose you are building a book ordering system (I've built a couple). Then you care a lot about the quantity on hand, the author's name, the ISBN, and the cost. Most of the time, perhaps 99% of the time, you really don't care about a lot of the data that makes the instance "big", like the printing date, publishing date, jacket blurb, photo of the author, etc.

In this case, you want to have lightweight data structures of the frequently accessed data, and keep the whole giant glob of bytes off in some store (RDBMS, memcached, etc.). If you use the light key, you are happy, and can always retrieve the "big" object using the key. If you use the whole Book object, you have it, all of it, whether you need it or not.

As a matter of business, while there are about a million books in print, well over 750,000 of them have zero sales in any given year. If you look at the activity logs, you will see that what you really care about are the 10,000 or so popular and new books. For those, you may want the "big" instances in memory. For the rest, go get them from the DBMS.
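
One way to sketch the "keep only the popular ones in memory" idea is a small least-recently-used cache keyed by ISBN. This is an illustration under assumptions, not a description of any real ordering system: HotBookCache is a made-up name, the "big" book is represented by a plain String, and the loader function stands in for whatever store (RDBMS, memcached, etc.) actually holds the data.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class HotBookCache {
    private final int capacity;
    private final Function<String, String> loader; // hypothetical: ISBN -> big payload
    private final Map<String, String> cache;
    int loads = 0; // how many times we had to go to the backing store

    HotBookCache(int capacity, Function<String, String> loader) {
        this.capacity = capacity;
        this.loader = loader;
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > HotBookCache.this.capacity;
            }
        };
    }

    // Returns the big object for this ISBN, loading it only on a cache miss.
    String get(String isbn) {
        String book = cache.get(isbn);
        if (book == null) {
            book = loader.apply(isbn);
            loads++;
            cache.put(isbn, book);
        }
        return book;
    }

    int cachedCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        HotBookCache books = new HotBookCache(2, isbn -> "big data for " + isbn);
        books.get("A");
        books.get("B");
        books.get("A"); // hit
        books.get("C"); // evicts B, the least recently used
        books.get("A"); // still a hit
        System.out.println(books.loads);         // prints "3"
        System.out.println(books.cachedCount()); // prints "2"
    }
}
```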
Aditya Jha
Ranch Hand

Joined: Aug 25, 2003
Posts: 227

I agree. My argument was based on the assumption that you want to keep the Book objects in memory anyway (and also that you can actually hold them). One use case could be a textual search application over a number of Books. Of course, practicality will dictate how many objects you would actually want to hold in memory.

If not, which is most often the case we encounter, loading only a 'key' is obviously a better choice.
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14269
    

Aditya Jha wrote: @Jesper Technically, does it really make a difference, assuming all of the Book objects have to be kept in memory anyway? What you described does hold true for databases from the point of view of normalization (or reducing data redundancy). However, in terms of objects, having a heavy instance as a key in a map shouldn't make operations on that map any slower, I think. Introducing a BookInfo will only add resource usage.

Though I agree it is better OOD, and BookInfo might help in some other use cases. Purely from this problem's perspective, however, it would be overkill, IMHO.

Whether the objects used as the keys are large or small doesn't really matter for the speed of looking up things (well, it depends on how the equals() and hashCode() methods of those objects are implemented). But speed of lookup was not the point here.

The point is that if you use large objects for the keys of the HashMap, then those large objects have to stay in memory (because the HashMap keeps a reference to them). The HashMap does not store only the hash codes of the key objects; it needs the key objects themselves.

If you use small BookInfo objects as keys, then you don't need to keep those large Book objects in memory, but just those small objects - which means you won't quickly get an out of memory error.
 