Just a theoretical question about Vectors and synchronization. (I'm currently doing some planning for a program I have in mind and so I'm not even at the pseudo-code stage but I just want to make sure that I have a grip on the issues.)
I have read that all methods in the Vector class are synchronized. Therefore, various threads calling add() won't clash with one another, because if there's an add() in progress, the latecomers will have to wait until the first one is done before executing their add(). The same is true for remove().
My question is this: is it possible for add() and remove() to clash? For example, suppose I have a thread doing an add() to a Vector which is currently empty, i.e., the Object being added will be the first and only object in the Vector. Suppose another thread tries to do a remove() on the first element in the Vector. Will it cause the add() a problem? Or does it not "see" the first element until the add() is completed?
Thanks in advance for any feedback. And btw, a *great* site. Very good community.
The other thread, the one that is going to call remove(), has to wait until the thread calling add() completes its job and releases the lock. A thread acquires a lock on an object, not on a method or a block of code. Vector's methods are synchronized on the Vector instance, so while one thread is inside add() on your Vector, no other thread can run any synchronized method of that same object until the first thread finishes and releases the lock.
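To make that concrete, here is a minimal sketch (the class and method names are made up for illustration). One thread adds to a shared Vector while another removes from it. Each add() and remove() call takes the Vector's lock, so the remover either sees an element completely added or does not see it at all.

```java
import java.util.Vector;

class VectorAddRemoveDemo {

    // Runs one adder and one remover against a shared Vector and returns
    // the number of elements left once both threads have finished.
    static int run(int count) {
        Vector<Integer> vector = new Vector<>();

        Thread adder = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                vector.add(i); // holds the Vector's lock for the whole call
            }
        });

        Thread remover = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                // remove(Object) returns false while the element is absent,
                // so retry until the adder has finished adding it.
                while (!vector.remove(Integer.valueOf(i))) {
                    Thread.yield();
                }
            }
        });

        adder.start();
        remover.start();
        try {
            adder.join();
            remover.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return vector.size();
    }

    public static void main(String[] args) {
        System.out.println("elements left: " + run(1000));
    }
}
```

Every element that goes in comes out exactly once, so the Vector ends up empty. Note that the remover uses a single remove(Object) call per attempt rather than a separate "is it there?" check followed by a remove.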
Joined: Jan 30, 2005
Thank you for the answer, but now I have another question.
I had read that Vectors were only synchronized on a method level, and that they were therefore not thread-safe. From what you are saying, though, it sounds like they would be thread-safe since the entire object (in this case a Vector) is locked. Am I missing something?
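The fact that methods of Vector are synchronised means that the Vector itself will not be corrupted when different threads are trying to simultaneously get, add and remove objects. (The exception is iteration: if another thread modifies the Vector while you are stepping through an Iterator, you can get a ConcurrentModificationException unless you hold the Vector's lock for the whole loop.)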
However, very often, the Vector is part of a larger data structure which needs to be kept consistent during multi-threaded access. In that case, the synchronisation of the Vector itself won't help.
In general, people these days steer away from Vector (and Hashtable) and instead use List (e.g. ArrayList) and Map (e.g. HashMap). These are unsynchronised internally. You should consider exactly what synchronisation is needed and implement that.
Slight aside: assert and Thread.holdsLock() can be good for checking that your synchronisation policies, once decided, are adhered to.
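A rough sketch of that idea, using a made-up GuardedList class: the private helper documents and checks the rule "only call me while holding the lock". Assertions are disabled by default, so the check only runs when the JVM is started with -ea.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedList {
    private final Object lock = new Object();
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        synchronized (lock) {
            addLocked(item);
        }
    }

    int size() {
        synchronized (lock) {
            return items.size();
        }
    }

    // Policy: callers must already hold 'lock'. The assert verifies this
    // when assertions are enabled (java -ea) and costs nothing otherwise.
    private void addLocked(String item) {
        assert Thread.holdsLock(lock) : "caller must hold the lock";
        items.add(item);
    }

    public static void main(String[] args) {
        GuardedList list = new GuardedList();
        list.add("first");
        list.add("second");
        System.out.println("size: " + list.size());
    }
}
```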
Thread safety is a rather large concern affecting applications, and that is why there is no simple trick to ensuring it. Hence the caution "using Vectors does not guarantee thread safety".
Absolutely, if you have a single Vector, you are guaranteed that only one method is called at a time on that Vector. At that very fine level, it is thread safe. The problem is that you generally want to perform groups of operations which count as a single logical action (let's say transaction). Problems will occur if multiple threads try to execute these transactions on the same Vector at the same time, unless things are synchronized properly at a coarser level.
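As an illustration of such a "transaction", take "remove the first element if there is one". isEmpty() and remove(0) are each synchronized, but another thread can slip in between the two calls. The sketch below (pollFirst and drain are hypothetical helpers, not part of Vector) holds the Vector's own lock across both calls, which works because Vector's synchronized methods lock on the Vector instance.

```java
import java.util.Vector;
import java.util.concurrent.atomic.AtomicInteger;

class CompoundActionDemo {

    // Hypothetical helper: removes and returns the first element,
    // or returns null if the Vector is empty.
    static <T> T pollFirst(Vector<T> vector) {
        synchronized (vector) { // one lock held across check and removal
            if (vector.isEmpty()) {
                return null;
            }
            return vector.remove(0);
        }
    }

    // Fills a Vector, drains it from several threads at once, and
    // returns how many elements were removed in total.
    static int drain(int elements, int threadCount) {
        Vector<Integer> vector = new Vector<>();
        for (int i = 0; i < elements; i++) {
            vector.add(i);
        }
        AtomicInteger removed = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                while (pollFirst(vector) != null) {
                    removed.incrementAndGet();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return removed.get();
    }

    public static void main(String[] args) {
        System.out.println("removed: " + drain(10000, 4));
    }
}
```

Without the synchronized block, two threads could both see a non-empty Vector holding one element, and the slower one would get an ArrayIndexOutOfBoundsException from remove(0).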
Your problems may be exacerbated when you involve multiple vectors, all being accessed simultaneously by multiple threads. Remember - multi-threaded applications are complicated for us humans to grasp properly, and we need to think long and hard about how they will function.
"Thread safety" is a somewhat misleading term. Instead think "will my program do exactly what I think it will, in all circumstances?"
Thanks to Peter and Fletcher for their replies.
I think I've got it now. I'll think some more and make sure, but I think I have a grasp on what these discussions mean.
One last question, though. It was stated that now people tend to shy away from Vector and to use ArrayList instead, which is apparently not synchronized at all. What is the advantage in this? It seems a bit like throwing the baby out with the bath-water to say that since the synchronization that's in place isn't perfect then let's not use it at all!
And, again, thanks a lot - this discussion has cleared up a lot for me.
Joined: Jul 01, 2004
People don't generally use Vector anymore because synchronization is slow, and since every method in Vector is synchronized, you can take a serious performance hit. Not only that, but as we've discussed, Vectors are synchronized at the wrong level.
Of course you still want thread safety with ArrayLists, but you can achieve it by synchronizing at a more appropriate level, depending on your needs (perhaps by synchronizing the method in your class that uses the ArrayList). If you want an ArrayList to perform exactly as a Vector, you can add a wrapper (using Collections.synchronizedList()). [ February 15, 2005: Message edited by: Fletcher Estes ]
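Here is a small sketch of the wrapper approach, assuming Collections.synchronizedList() around an ArrayList (the demo class and its methods are made up). Individual calls such as add() are synchronized by the wrapper, but iteration is a multi-step operation, so the caller has to hold the list's lock around the loop, as the Collections Javadoc requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {

    // Iterating is a compound action, so lock the wrapper for the whole loop.
    static int sum(List<Integer> list) {
        int total = 0;
        synchronized (list) {
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    // Several threads each add the value 1 'perThread' times;
    // returns the sum of everything that ended up in the list.
    static int run(int threadCount, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(1); // a single call, synchronized by the wrapper
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sum(list);
    }

    public static void main(String[] args) {
        System.out.println("sum: " + run(4, 1000));
    }
}
```

With a plain, unwrapped ArrayList the same test could lose additions or throw, which is why the synchronization has to live somewhere, either in a wrapper like this or in the class that owns the list.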
His library is now partly embedded in Java Tiger. I was very charmed by the ReadWriteLock paradigm.
Thanks for the answer - it's clearer now. I didn't realize that synchronization was so slow. That being the case, it makes sense to take an unsynchronized alternative and only add in the synchronization that's necessary.
Thanks for the link. I had a quick look and it does appear to be interesting. I'll have to spend some more time reading, but it looks like it's a valuable link.
Synchronization has gotten much faster with the latest JVMs, but there's no point in using it if you don't need it. If you know your Vector/Hashtable will be used by only one Thread, then use an ArrayList/HashMap instead.
But when you need thread-safe implementations, often you can get better performance using a class built with smarter synchronization. For example, ConcurrentHashMap in JDK 1.5 avoids a single map-wide lock. I had pictured it as using two locks, one for reading and another for writing (I'm extrapolating from Doug Lea's work on which it is based). The actual class splits the table into independently locked segments and lets most reads proceed without locking, but the read/write-lock idea, available in JDK 1.5 as ReentrantReadWriteLock, is still a good illustration of smarter synchronization.
See, looking up a key and getting the size of a HashMap are safe to run alongside each other, meaning any number of threads could be calling those methods without causing problems, because they don't modify the structure. So the read lock is acquired when calling those methods, and it doesn't block other threads from acquiring the read lock.
When a thread needs to alter the object -- put(key, value) -- then it must acquire the write lock. To do so, all threads that have the read lock must release their hold before the write lock can be acquired. Once that happens, that thread gets the write lock, which blocks any other lock attempt, both read and write, giving the thread exclusive access while it adds a new key. Once done, it releases the write lock and it's back to normal. (Replacing the value for an existing key doesn't change the map's structure, but with a plain read-write lock it should still take the write lock so that readers reliably see the new value.)
This makes a map guarded this way much more performant in the case of many readers and few writers. If the implementation is good enough, you can tune the locks by giving preference to threads based on whether they want to read or write.
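The read/write locking pattern described above can be sketched with java.util.concurrent.locks.ReentrantReadWriteLock guarding a plain HashMap. ReadWriteMap is a made-up class for illustration only; it shows the pattern and is not how ConcurrentHashMap is implemented internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    // Passing true to the constructor would request a fair ordering policy.
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Readers share the read lock, so they do not block each other.
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer waits for all readers to leave, then has exclusive access.
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteMap<String, Integer> scores = new ReadWriteMap<>();
        scores.put("alice", 1);
        scores.put("alice", 2);
        scores.put("bob", 3);
        System.out.println(scores.get("alice") + " " + scores.size());
    }
}
```

Whether this beats a simple synchronized wrapper depends on the mix of reads and writes, so it is worth measuring before committing to it.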
As you can see, simple synchronization across the board on a class works, but it's suboptimal in most cases. Again, though, if you don't need it, why pay the price?