Synchronizing a Hash Map accessed by two threads.

Bimal Ram
Greenhorn

Joined: Apr 09, 2007
Posts: 1
I have a Hash Map into which one application inserts objects. Another thread accesses this Hash Map, does some processing on the objects, and ultimately removes them from the Hash Map. How can I synchronize the Map insertion and removal without causing a concurrent modification exception?
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

If you are on JDK 5 or later, you can use ConcurrentHashMap. It is an efficient implementation of a concurrent hash map that uses different locks for different portions of the map. This reduces lock contention and hence increases throughput considerably compared to a synchronized Map.
For JDK versions before JDK 5 there is a backport of the same.
If you do not want to use the above, you can use the Collections.synchronizedMap() method to create a synchronized map. The new map is backed by the original map, and all access is synchronized on a single monitor. You also have to synchronize externally while iterating over the map entries. (See the javadoc for details.)
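
A minimal sketch of the two options described above, assuming String keys and Integer values (the class and method names are made up for illustration). It shows the one place where Collections.synchronizedMap() needs help from the caller: iteration has to be wrapped in a synchronized block on the map itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper class; the names are illustrative only.
class SharedMapOptions {

    // Option 1: ConcurrentHashMap, no external locking needed for single calls.
    static Map<String, Integer> newConcurrentMap() {
        return new ConcurrentHashMap<String, Integer>();
    }

    // Option 2: a HashMap wrapped so that every call locks one shared monitor.
    static Map<String, Integer> newSynchronizedMap() {
        return Collections.synchronizedMap(new HashMap<String, Integer>());
    }

    // Iterating over a synchronized map is NOT covered by the wrapper:
    // the caller must hold the map's monitor for the whole loop.
    static int sumValues(Map<String, Integer> synchronizedMap) {
        int sum = 0;
        synchronized (synchronizedMap) {
            for (int value : synchronizedMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = newSynchronizedMap();
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(sumValues(map)); // prints 3
    }
}
```

While sumValues() holds the monitor, every other thread calling put() or remove() on the same wrapper waits, which is the single-monitor cost mentioned above.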


Anirudh Vyas
Ranch Hand

Joined: Oct 23, 2006
Posts: 93
I don't agree with ad-hoc usage of ConcurrentHashMap. In the words of the javadoc for this utility, it clearly says that even though all operations are thread safe, retrieval operations are not thread safe at all; in fact, while retrieving some data, you may experience a concurrent update going on (that is, retrieval overlaps with updates of the hash table maintained by this class).

If you look at the code for the size() method in this class, you will know what I am talking about ... it will basically run through the map in a loop with a LOCKING_THRESHOLD and then, understandably (that spelling sounds funny, but anyway), obtain a lock to get and return a size.

There are certain ways to use concurrent HashMap effectively; new concurrent classes don't mean that you "assume" that everything takes care of synchronization blues for me ...

Just some thoughts ...

Regards
Vyas, Anirudh

Addendum: Nice catch with the backport, I wasn't aware of that ... thanks

[ February 18, 2008: Message edited by: Anirudh Vyas ]

Vyas, Anirudh
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
[Anirudh Vyas]: I don't agree with ad-hoc usage of ConcurrentHashMap. In the words of the javadoc for this utility, it clearly says that even though all operations are thread safe, retrieval operations are not thread safe at all

No, it doesn't. It says that retrieval operations do not entail locking. That's not the same as not being safe. They very clearly just said that all operations are thread safe. To say that retrievals are unsafe would be contradictory.

[Anirudh Vyas]: in fact, while retrieving some data, you may experience a concurrent update going on (that is, retrieval overlaps with updates of the hash table maintained by this class).

Yes, overlap can occur - but that's not unsafe, not the way it can be on a HashMap for example, where you could get an unexpected NullPointerException, or fail to get() an object that really is in the map, because the array was being resized at the time. No, overlap on a ConcurrentHashMap means much more manageable things. One, it means that if you've got a method that represents a series of actions, such as putAll(), then other methods may observe the map in a state where some of those actions have occurred, but not all of them. If that is a problem, you shouldn't use CHM - but in my experience it generally isn't a problem. Two, overlapping actions may create lock contention in some cases, which basically means that the method call will be slower than you might like. It's still thread-safe though.

[Anirudh Vyas]: If you look at the code for the size() method in this class, you will know what I am talking about ... it will basically run through the map in a loop with a LOCKING_THRESHOLD and then, understandably (that spelling sounds funny, but anyway), obtain a lock to get and return a size.

I've seen the code, but don't understand the problem. The size() code first attempts a few times to obtain the size without locking, because this is quicker. If it fails, then it resorts to locking, which is slower (and which will slow down other put() operations). Either strategy works, safely, but in cases of high thread contention, full locking can be slow. Which may be a problem, sure - but that's just part of the tradeoffs you need to compare against other solutions. High thread contention for mutable data usually creates slowness, one way or another; we just look for the best way to manage it, depending on our specific needs.

[Anirudh Vyas]: There are certain ways to use concurrent HashMap effectively; new concurrent classes don't mean that you "assume" that everything takes care of synchronization blues for me ...

I agree with this sentiment in general, as I have long distrusted the very term "thread-safe" since it is routinely applied to uselessly-synchronized classes like Vector, where you really need additional synchronization in order to do anything useful. In general though, I think CHM does a much better job than Vector (or Hashtable) of providing useful functionality that does not require additional locking to guarantee safe useful behavior. It's not the best way of handling all your Map needs in a concurrent environment, but it handles many of them, and is worth considering, I think.
[ February 18, 2008: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Anirudh Vyas
Ranch Hand

Joined: Oct 23, 2006
Posts: 93
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove).

Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries.

Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.

The above statements tell me that it's just NOT a fairy who's going to solve all my problems just by being used in place of a regular HashMap; so yeah.

Vyas
[ March 01, 2008: Message edited by: Anirudh Vyas ]
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18896
    

The above statements tell me that it's just NOT a fairy who's going to solve all my problems just by being used in place of a regular HashMap; so yeah.


I don't think anyone here is suggesting that going from an unsynchronized HashMap to a ConcurrentHashMap is going to magically solve threading issues. This is why everyone added some qualifications to their responses.


Most of the pushback that you have gotten is from your statement that parallel writes and retrievals are not threadsafe. Keep in mind, threadsafe doesn't guarantee correctness. It just means that it won't generate random exceptions, or spin off into an endless loop, when it is used in parallel.

The concurrent hashmap is threadsafe -- but that doesn't mean it will solve race conditions which exist in logic. Or even race conditions between calls to the write/retrievals.

Henry
[ March 01, 2008: Message edited by: Henry Wong ]
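
To make the "race conditions between calls" point concrete, here is a small sketch, assuming a hypothetical per-key hit counter (HitCounter and its method names are invented for this example). Each individual get() and put() is thread-safe, yet the get-then-put sequence as a whole is not; the second method retries with putIfAbsent() and replace(), both of which have been part of ConcurrentMap since JDK 5.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example class; not taken from the thread.
class HitCounter {
    private final ConcurrentMap<String, Integer> hits =
            new ConcurrentHashMap<String, Integer>();

    // Racy: another thread can update the entry between get() and put(),
    // and that update is then silently overwritten.
    void incrementRacy(String key) {
        Integer old = hits.get(key);
        hits.put(key, old == null ? 1 : old + 1);
    }

    // Atomic: each attempt succeeds only if the entry is still what we read;
    // otherwise we loop and try again with the fresh value.
    void incrementAtomic(String key) {
        for (;;) {
            Integer old = hits.get(key);
            if (old == null) {
                if (hits.putIfAbsent(key, 1) == null) {
                    return;
                }
            } else if (hits.replace(key, old, old + 1)) {
                return;
            }
        }
    }

    int get(String key) {
        Integer value = hits.get(key);
        return value == null ? 0 : value;
    }

    public static void main(String[] args) throws InterruptedException {
        final HitCounter counter = new HitCounter();
        Runnable task = new Runnable() {
            public void run() {
                for (int i = 0; i < 10000; i++) {
                    counter.incrementAtomic("page");
                }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.get("page")); // prints 20000
    }
}
```

Swapping incrementAtomic() for incrementRacy() in main() can produce a total below 20000 on some runs, without any exception being thrown, which is exactly the "threadsafe but not correct" situation described above.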

Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Anirudh Vyas
Ranch Hand

Joined: Oct 23, 2006
Posts: 93
Hmmph. Point well taken and agreed

The only reason I mentioned this is because I have seen too many people say: Oh well, use ConcurrentHashMap and you're done ...


Regards
Vyas, Anirudh
Billy Tsai
Ranch Hand

Joined: May 23, 2003
Posts: 1304
So what the hell do we do (HashMap, ConcurrentHashMap, or anything else?) when we need to store tens of thousands of data records while multiple threads are accessing them and might even be updating a certain value via its key?


BEA 8.1 Certified Administrator, IBM Certified Solution Developer For XML 1.1 and Related Technologies, SCJP, SCWCD, SCBCD, SCDJWS, SCJD, SCEA,
Oracle Certified Master Java EE 5 Enterprise Architect
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187
    

The problematic operations are really only the ones that give you an iterator over the whole map (or its keys or values). If the map is just being used as a dictionary by many threads then either Collections.synchronizedMap() or ConcurrentHashMap would work fine; ConcurrentHashMap may be more efficient.

If, on the other hand, some threads need to iterate over the whole map while other threads might be calling get/put, then you actually have to think about the problem. What does it mean to iterate over a changing collection? Maybe you want to block all access during the iteration? Then use synchronizedMap() and synchronize the whole loop on the Map. Maybe you don't really care if some items are missed, or some items are removed after being iterated over but before the iteration is over? Then just use ConcurrentHashMap.
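
The difference in iterator behaviour can be seen even in a single thread. The sketch below is illustrative only (the class and method names are invented): it removes entries from a map while iterating over its key set, once with a plain HashMap and once with a ConcurrentHashMap.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; single-threaded on purpose to keep it deterministic.
class IterateWhileRemoving {

    // Puts ten entries into the given map and returns it.
    static Map<Integer, String> filled(Map<Integer, String> map) {
        for (int i = 0; i < 10; i++) {
            map.put(i, "value" + i);
        }
        return map;
    }

    // Removes each key through the map (not through the iterator) while
    // iterating. Returns true if the loop finished, false if the iterator
    // threw ConcurrentModificationException.
    static boolean drain(Map<Integer, String> map) {
        try {
            for (Integer key : map.keySet()) {
                map.remove(key);
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // HashMap iterators are fail-fast.
        System.out.println(drain(filled(new HashMap<Integer, String>())));           // prints false
        // ConcurrentHashMap iterators are weakly consistent and never throw CME.
        System.out.println(drain(filled(new ConcurrentHashMap<Integer, String>()))); // prints true
    }
}
```

With several threads involved, the ConcurrentHashMap loop still never throws, but it may or may not see entries that other threads add or remove while it runs, which is the trade-off described above.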


[Jess in Action][AskingGoodQuestions]
Ed Dh
Greenhorn

Joined: Mar 22, 2007
Posts: 4
Hello,

At application startup I need to load a couple of Maps which are thereafter used by multiple threads.
That is, a request comes in and the loaded Maps are used to find out whether they do or don't contain a particular key.
If the key is found, the object behind the key, is associated to the object being validated.

At times the content of the Maps change. I don't want to restart my application to reload the new situation.
Instead I want to do this dynamically.

However, at the time the Maps are re-loading, concurrent read requests on those Maps arrive.

How can I implement this in the most performant way while avoiding unexpected exceptions such as NullPointerException (because certain objects are being removed from the collection at the time when the read requests arrive)?
How can I implement this in the most performant way when I would really like to block all read requests until the reload operation has been fully completed ?

Thanks,
E
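
One possible approach to the first question, offered as a sketch under assumptions rather than as the answer given in this thread: build the reloaded map off to the side and publish it by swapping a volatile reference. Readers never block and never see a half-loaded map. The class name, method names and the String key/value types are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup holder; assumes readers only need get()-style access.
class ReloadableLookup {

    // volatile: a reader always sees the most recently published snapshot,
    // fully populated, or the previous one; never a partially filled map.
    private volatile Map<String, String> snapshot =
            Collections.<String, String>emptyMap();

    // Called by many request threads; takes no lock.
    String find(String key) {
        return snapshot.get(key);
    }

    // Called by the reloading thread: copy first, then publish in one step.
    void reload(Map<String, String> freshData) {
        Map<String, String> copy = new HashMap<String, String>(freshData);
        snapshot = Collections.unmodifiableMap(copy);
    }

    public static void main(String[] args) {
        ReloadableLookup lookup = new ReloadableLookup();
        Map<String, String> data = new HashMap<String, String>();
        data.put("code", "first");
        lookup.reload(data);
        System.out.println(lookup.find("code")); // prints first
    }
}
```

For the second question (blocking all reads until the reload has finished), a java.util.concurrent.locks.ReentrantReadWriteLock would be one option to look at: readers take the read lock and the reloading thread takes the write lock. That costs every reader a lock acquisition, which the reference swap above avoids.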
Javin Paul
Ranch Hand

Joined: Oct 15, 2010
Posts: 281

Bimal Ram wrote: I have a Hash Map into which one application inserts objects. Another thread accesses this Hash Map, does some processing on the objects, and ultimately removes them from the Hash Map. How can I synchronize the Map insertion and removal without causing a concurrent modification exception?



You can do this either by wrapping your HashMap in a synchronized map, e.g. java.util.Collections.synchronizedMap(yourMap), or by using Hashtable.

To read more, see here: http://javarevisited.blogspot.com/2010/10/what-is-difference-between-synchronized.html


http://javarevisited.blogspot.com - java classpath - Java67 - java hashmap - java logging tips java interview questions Java Enum Tutorial
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4181
    

Javin Paul wrote:
Bimal Ram wrote: I have a Hash Map into which one application inserts objects. Another thread accesses this Hash Map, does some processing on the objects, and ultimately removes them from the Hash Map. How can I synchronize the Map insertion and removal without causing a concurrent modification exception?



You can do this either by wrapping your HashMap in a synchronized map, e.g. java.util.Collections.synchronizedMap(yourMap), or by using Hashtable.

To read more, see here: http://javarevisited.blogspot.com/2010/10/what-is-difference-between-synchronized.html


First, the best options have already been discussed:
1) ConcurrentHashMap
2) Collections.synchronizedMap()

Second, I don't think Hashtable is ever the right answer in current versions of Java.

And finally, if I re-read the request I am not sure if any of these 'simple' solutions is the right answer - we may need more information about the sequential consistency required between the 'access the Hash Map', the 'processing the objects', and finally the 'removes the object' steps. In most cases I would lean towards ConcurrentHashMap (taking advantage of some of its 'optimistic' operations for the removal of the object at the end), but there may need to be further synchronization. I don't think we will get that information, though, given the original post is about 3 years old.


Steve
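
One reading of the "optimistic" removal mentioned above is ConcurrentMap.remove(key, value), which removes the entry only if it is still mapped to the value that was processed. The sketch below assumes String ids and payloads and an empty placeholder process() step; all names are hypothetical, and whether this gives the consistency the original poster needed is unknown.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical producer/consumer holder; not the original poster's code.
class ProcessThenRemove {
    private final ConcurrentMap<String, String> pending =
            new ConcurrentHashMap<String, String>();

    // Called by the inserting thread.
    void submit(String id, String payload) {
        pending.put(id, payload);
    }

    // Called by the processing thread. Returns true if the processed entry
    // was removed; false if it was missing, or if the inserting thread
    // replaced it with a different payload while we were processing.
    boolean processAndRemove(String id) {
        String payload = pending.get(id);
        if (payload == null) {
            return false;
        }
        process(payload);
        // Conditional remove: succeeds only if id still maps to an equal payload.
        return pending.remove(id, payload);
    }

    // Placeholder for the real work; assumed to be side-effect free here.
    void process(String payload) {
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        ProcessThenRemove queue = new ProcessThenRemove();
        queue.submit("job-1", "data");
        System.out.println(queue.processAndRemove("job-1")); // prints true
        System.out.println(queue.pendingCount());            // prints 0
    }
}
```

When processAndRemove() returns false because the payload changed, the newer payload stays in the map, so a later pass can pick it up instead of it being lost.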
 