Sometimes thinking of complicated things in a certain way helps to clarify their use, even if the underlying mechanisms are constructed differently -- the Bohr model of the atom is a good example. I treat the JVM as being highly complex internally, since vendors are free to do all sorts of strange things to optimize performance, and so I assume there are all kinds of inter- and intra-thread strategies of local vs. global cache referencing and copying. So I came up with five rules of thumb that I program by:
* Unless explicitly protected, at some point threads will tend to both clobber and miss each other's values
* 'static' says it's "one copy for all instances", but I treat it as guaranteed "clean" only within a thread
* 'synchronized' guarantees clean across threads, and is "relatively expensive" (performance-wise)
* 'volatile' is also clean across threads, yet is "comparatively inexpensive", but might have a "lag time"
* Atomic/Blocking variable is just as good as 'synchronized' for that variable, and is "mildly expensive"
Remember -- these are the assumptions I use, not a declaration of architectural accuracy. What they do is let me program so that the worst-case resulting behavior is still guaranteed. Basically, the heavier the synchronization, the bigger the performance hit; the lighter the synchronization, the less certain the timing of noticing changes. But there's one axiom that the multi-threading programmer must know about the JVM, regardless of the vendor:
NO SYNCHRONIZATION = NO INTER-THREAD VISIBILITY GUARANTEE
There are other non-synchronous rules that do guarantee visibility -- things like certain "static final" declarations typical of constants -- but people are asking about regular, mutable variables here that take on multiple states throughout the life of a process, and this one axiom is vital.
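For completeness, here is a tiny hedged illustration of that exception (the names are made up): 'static final' constants are visible to every thread without extra synchronization, because class initialization is itself safely published by the JVM -- but this only covers values that never change.

```java
// Hypothetical constants class, for illustration only
public final class Limits {
    public static final int MAX_RETRIES = 5;          // safe to read from any thread, no locking
    public static final double ERROR_THRESHOLD = 0.02; // ditto -- the value never changes
}
```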
Here are some resulting strategies I use by applying these rules of thumb (numbered only for later reference):
1. If I want one copy in a multi-threaded application, and not have to pass a reference, I use 'static'
2. If I expect all (or many) threads to alter the value, I use explicit synchronization (or Atomic/Blocking)
3. If I expect one to write, but all others to read, I add 'volatile'
4. If I expect read-often, but very, very, very few occasions to write, I add 'volatile'
5. If absolutely every single read/write is utterly vital to the operation at the moment it occurs, I use Synchronized forms
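To make the mapping concrete, here is a hedged sketch with one illustrative declaration per numbered strategy. The names are hypothetical; only the forms matter:

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedState {
    // #1: one copy for the whole application, without passing a reference around
    static int runMode;

    // #2: many threads both read and write -> explicit synchronization or an Atomic/Blocking class
    static final AtomicInteger hitCount = new AtomicInteger();

    // #3: one writer, many readers -> 'volatile' gives the visibility
    static volatile boolean shutdownRequested;

    // #4: read constantly, written very rarely -> 'volatile' again
    static volatile long lastCheckpointId;

    // #5: every read and write must be exact at the moment it occurs -> lock both sides
    private static long balance;
    static synchronized void deposit(long amount) { balance += amount; }
    static synchronized long balance() { return balance; }
}
```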
Notice that my use of 'static' has nothing to do with the protection of variable values, only with whether I want a single variable representation when I'm not passing a direct variable reference. I use all the other forms when I'm trying to determine the protection and visibility of the values. Notice also that #5 refers to two concepts: both the protection of the value and the timing of the value change. This latter is a big part of the visibility of changes -- it refers to when there's guaranteed to be inter-thread accuracy of reading.
Visibility is very important: operations that look atomic as a single Java statement may require a dozen internal JVM operations to complete! Regardless of how it's accomplished within the JVM, I treat 'volatile' by a model of "visibility protection light" (again, it may not be accurate, but it's useful): I'm guaranteed to see the variable change, but I'm not exactly guaranteed on the timing. My thought-model can be concretely explained by an example (a completely hypothetical one):
Take an operation that we'll assume requires 20 JVM operations to accomplish, and that the JVM performs by copying the memory to a temporary area, operating on it, and then immediately writing the modified value back. Say Thread A requests it, and at operation 7, Thread B asks for the value and the multiprocessing system allows Thread B to quickly jump in and access the original memory. Under 'volatile', Thread B gets the value from the same location, but Thread A isn't finished updating, so Thread B sees the old value. In this program, Thread B will eventually ask for the value again, and because Thread A has finished by then, it will see the new value in that memory location. Under 'synchronized', Thread B will get the result of Thread A's total work on every call, but for now is entirely locked out for the 13 operations still required to finish Thread A's request -- a definite performance penalty, but necessary if the semantics of the work demand it -- because threads are blocked out not just for writing, but also for reading. And it's not just writers blocking other writers to assure modification protection -- WRITERS ALSO BLOCK READERS, AND READERS BLOCK EVERYBODY AS WELL! Again, for those of you nit-pickers: this is an operational model, not an architectural specification.
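Here is that thought-model as a small sketch. The class and the "slow" multi-step updates are entirely hypothetical; the point is only that a volatile reader never blocks but may see the pre-update value, while a synchronized reader waits out the writer:

```java
class BalanceModel {
    private volatile int volatileBalance = 100;
    private int lockedBalance = 100;

    // "Thread A": a multi-step update. The volatile write only becomes visible at the end.
    void slowVolatileUpdate(int delta) {
        int temp = volatileBalance;   // step 1: copy out
        temp += delta;                // steps 2..19: compute (imagine many JVM operations)
        volatileBalance = temp;       // step 20: publish; readers see the old value until here
    }

    int readVolatile() {
        return volatileBalance;       // never blocks; may return the pre-update value
    }

    // Writer and reader share one lock: the reader waits out all 20 "operations".
    synchronized void slowLockedUpdate(int delta) {
        int temp = lockedBalance;
        temp += delta;
        lockedBalance = temp;
    }

    synchronized int readLocked() {
        return lockedBalance;         // blocks while slowLockedUpdate() is running
    }
}
```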
So, what would I use for a bank ATM transaction application? Look at Result #5! I'd use 'synchronized' and Atomic/Blocking variables. Period.
But where would I use 'volatile'? Before I give the "perfect use-case" that Daniel requests, let me provide an allegory.
--------
You're at a track meet, and the 100m race is underway. All the contestants are in the 'set' position, and all will wait for the starting gun. The gun goes off, and everybody starts running, but some are a little slower than others on the uptake, not reacting quite as fast to the signal -- but that's OK, it's even expected.
Now, the starter determines after 1 second of running that somebody actually had a false start, and fires a second round. Each contestant reacts in time -- some faster than others -- to the interruption signal, and stops running. But it's OK that some go farther down the track than others, even after the signal, as all the results will be thrown away. One runner could even go to the finish line, which might be entertaining to the crowd, but doesn't hurt anything.
--------
On to the "perfect use-case": I have a program that divides up a massive amount of work that single-threaded would require 18 hours for larger customers. The program allows the administrator to set the number of threads, which can be slightly CPU-heavy, so I suggest 1.6 threads per CPU (so on a 4-CPU system, use 6 threads). There is a main process that sets everything up, including the threads: one Writer, and multiple Read&Compute (Readers). There's one more thread that's running -- the main program acts as a Master Monitor on the progress throughout operation, cleaning up when all the child threads are finished. Here are some variables, direct from this very popular program:
* ArrayBlockingQueue workQueue: This is created and owned by the Master, and its reference is passed to all threads. Being internally synchronized, no Reader can clobber another thread's manipulation of the queue. Furthermore, that same internal synchronization means, from a JVM perspective, that when a Reader finishes computing and places the result on the queue, the finished work is guaranteed to be visible when the Writer picks it up from the queue. Both data protection and visibility are important here, so I used a Synchronized class form (Result #5). But I didn't need 'static', because the actual variable reference is explicitly passed as a parameter to the thread classes, so there's no need to declare that there's one copy; they all work on the passed reference.
* static AtomicInteger mProcReady: Since the threads divide up the work, I don't want only a subset of them to succeed; all must be able to get through their initializations. Each thread increments this counter when it's finished initializing and is ready to do work. The Master thread waits and polls until this counter reaches the number of threads it knows it spawned. It's absolutely vital that no increments are lost (see Result #5), because otherwise the Master will wait forever! It's a shared value that's utilized directly through the variable name "mProcReady" and not through a reference, hence I need to declare it 'static'. But it needs absolute protection (and the Master also wants to know quickly when conditions are right: idle time is wasted time!), so I used the Synchronized form of 'Atomic'. I could have used a latch variable to count down, but it's just as easy for the Master thread to track the count-up, polling every second. NOTE: You CANNOT use a plain integer with "++" -- it's not guaranteed to be an atomic operation, and thus one thread could clobber another's value in the middle of the multi-operation sequence JVMs often use to implement "++" -- there's no protection! (I could protect it by writing extra code so that the whole operation was explicitly wrapped in 'synchronized' -- but why clutter up the code? Why not just use something that internally does that for me anyway?)
* static volatile boolean mStartGun: Once ready, each child thread goes into a wait-state, polling/sleeping until all the children are ready. This condition is signaled by the Master once it sees the Atomic counter reach the desired state. In this case, all the children only read this value, and only the Master is changing the value (Result #3) -- and it changes only once, so there couldn't be any "lost changes" due to some ultra-quick intermediate states; it's off, and at some point it goes "ON" and stays that way. And so what if a child thread starts work a second later, either because of its own sleep/poll check, or because the visibility of the change is just ever so slightly slower than an explicit synchronized form? Just like the Track Meet start.
* static volatile boolean mHaltGun: Throughout the processing (which may take hours), an error counter is tracked. If a threshold is reached, then the whole thing should be killed -- but cleanly. It turns out that the Halt Gun can be triggered by any thread, not just one. Like the Start Gun, we need a single Halt Gun (hence 'static'), but in this case there's likely to be only one write to it -- BUT IT'S OK if there's more than one, so 'volatile' by Result #4. Here's an important aspect of the decision: if it does happen that there's more than one write to the variable (two threads hit an error at the same time, and both conclude at the same time that the threshold is exceeded), THEY CHANGE IT TO THE SAME VALUE. Thus, there can't be any lost intermediate states, because "clobbering" the value would always just set it to the correct desired value in the first place. And so what if any given thread takes a second longer than the others to realize the Halt Gun signal went off? But why 'volatile' -- why not use a heavy form of synchronization, just to be sure? Because each thread tests the signal on every unit of work, and there are billions of them -- performance counts! No need for each thread to lock out the others just to read this thing 700 million times each! And who cares if, due to the relaxed visibility, a thread happens to process just one more unit than it would have under full lock-out? Again, so what if one Track Meet competitor runs an extra stride further than the others when Halt is signaled?
* static volatile long mMemCount: I also use 'static volatile' for a progress counter, where only the Write thread changes it (Result #3), but the other threads read it just to periodically flush work so that the system maintains general equilibrium. Again, it's read billions of times, so a less costly form is advantageous. And so what if the Master thread's progress report to the log based on this counter skips a beat? It's a threshold-based reporting entry that contributes no work accomplishment in itself; if, due to relaxed visibility, the Master saw '499999' when it checked at the precise moment the Write process was changing it to '500000', who cares? The Master will report in another 10 seconds when it sees '500007', because that too exceeds the threshold -- no processing component will lose its vitality and blow up just because the exact half-million mark shot by unnoticed. There's simply no need to lock all threads out of reading this value just because it's also being modified by the lone thread that does so.
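Putting those variables together, here is the minimal skeleton promised above. Everything not named in the list -- the class name, queue capacity, poll intervals, and the placeholder "work" -- is hypothetical; only the declaration forms mirror the variables just discussed.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class BatchMaster {

    // Result #5 (Atomic form): lost increments would hang the Master
    // (a plain int with "++" is NOT safe -- "++" is a read-modify-write sequence)
    static final AtomicInteger mProcReady = new AtomicInteger(0);

    // Result #3: only the Master writes it, every child reads it -- the Start Gun
    static volatile boolean mStartGun = false;

    // Result #4: written rarely, and only ever set to 'true' -- the Halt Gun
    static volatile boolean mHaltGun = false;

    // Result #3: only the Writer thread advances this progress counter
    static volatile long mMemCount = 0L;

    public static void main(String[] args) {
        // roughly 1.6 worker threads per CPU, as suggested above
        int readers = Math.round(Runtime.getRuntime().availableProcessors() * 1.6f);

        // Result #5: internally synchronized hand-off, passed by reference -- no 'static' needed
        ArrayBlockingQueue<String> workQueue = new ArrayBlockingQueue<>(10_000);

        Thread writer = new Thread(() -> {
            while (!mHaltGun) {
                try {
                    String result = workQueue.take();  // queue guarantees the finished work is visible
                    // ... write 'result' out ...
                    mMemCount++;                       // safe: the Writer is the ONLY thread that writes it
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        writer.start();

        for (int i = 0; i < readers; i++) {
            new Thread(() -> {
                mProcReady.incrementAndGet();          // report "ready" -- this increment must never be lost
                while (!mStartGun) {                   // wait for the Start Gun, Track-Meet style
                    sleepQuietly(250);
                }
                while (!mHaltGun) {                    // test the Halt Gun on every unit of work
                    try {
                        // ... read and compute one unit of work ...
                        workQueue.put("computed-result");
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }).start();
        }

        // Master: poll until every child has checked in, then fire the Start Gun
        while (mProcReady.get() < readers) {
            sleepQuietly(1000);
        }
        mStartGun = true;

        // ... monitor progress via mMemCount, set mHaltGun on an error threshold, clean up ...
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```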
This model works under even the heaviest load, and has proven itself time and again on systems of any size (1 to 32 processors). I've certainly given more than $0.02 here, and hopefully I've been able to contribute in a small way to answering the curious. And by the way, I don't know if I agree that this is "beginner" stuff -- I feel it's "advanced beginner threading" stuff, but not "beginner Java" or "beginner programming" stuff, which is what I associate with the "beginner" forum.