Double-Checked locking and Lazy Instantiation

Chan Ag
Bartender

Joined: Sep 06, 2012
Posts: 1000
    
  16
Hi,

I've been studying minimal synchronization techniques from the book 'Java Threads', third edition ( Page 86 ), where the authors explain why DCL is discredited and hence should not be used. I don't think I can copy the small code snippet from the book here because of copyright. However, I have seen a similar thread with similar code, so I will paste its link here.

http://www.coderanch.com/t/610144/java/java/Double-check-Idiom-lazy-initialization#2786666

Let us also assume that the object to be constructed is non-volatile.

Let's, for the purpose of understanding the question that follows, number the statements.

1st statement - First check if the object ( the object to be constructed) is null.
2nd statement - the block statement that synchronizes on this object.
3rd statement - second check of if the object is null within the synchronized block.
4th statement - construct a new object if 3rd statement is true and store the reference.
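
For reference, the four statements above can be sketched as follows. This is only a minimal illustration, not the book's code: the names Foo, Helper and getHelper are assumptions borrowed from the names used later in this thread, and the field is deliberately left non-volatile, so this is the discredited form of the idiom.

```java
// Sketch of the discredited double-checked locking idiom (non-volatile field).
// Foo, Helper and getHelper are illustrative names, not the book's code.
class Helper {
    int value; // hypothetical state, set during construction

    Helper() {
        value = 42;
    }
}

class Foo {
    private Helper helper; // NOT volatile: this is what makes the idiom unsafe

    Helper getHelper() {
        if (helper == null) {              // statement 1: unsynchronized first check
            synchronized (this) {          // statement 2: lock on this object
                if (helper == null) {      // statement 3: second check, under the lock
                    helper = new Helper(); // statement 4: construct and store the reference
                }
            }
        }
        return helper;
    }
}
```

In a single thread this behaves as expected; the questions below are about what other threads may observe at statement 1, which is not protected by the lock.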

Now coming to my question - ok, the goal here is to prevent synchronization once the object is initialized. The authors state that the value for the object can be stored before the constructor is called.

So could case 1 happen?

Case 1:
Say I have two threads on this object. The first one is running within the synchronized block but hasn't yet constructed the object, while the second one has just passed the first check (statement 1). Yes, the object is null at this point, and the second thread is now waiting for the lock on this object to be released. But the second thread cannot enter the synchronized block until the first thread is done constructing the object. Once the first thread is done, the second thread enters the synchronized block ( yes, we have failed at the goal of avoiding synchronization here ). The second thread then checks if the object is null ( the third statement ) and does not construct a new object because it is already constructed. Could this be possible?

Case 2:
Could it be possible that, because of local caching ( the object is not volatile ), when the second thread checks if the object is null in statement 3, it gets true and reconstructs the object?

But the authors are probably talking about Case 3, described below.
Case 3:
Somewhere in the discussion, the authors also imply ( I'm not sure if my interpretation of what the authors mean is right here. Hence I say imply ) that it's possible that the second thread gets the reference of the object before the first thread is done constructing the object. The second thread can then call other methods that can be called on the constructed object ( obviously methods cannot be called if object was still null. Hence the authors must mean the reference the object would get once it is constructed is leaked before the construction is complete. Or am I missing something here? ) although the first thread is not yet done constructing the object. I could not understand how this is possible.

Since construction is done within a synchronized context, how can the reference be leaked before construction? And isn't reference assigned as the last step of the object construction process? Is there something I'm missing here?

Thanks,
Chan.

Chan Ag
Hi,

I clicked on the link provided by Paul in the other thread --> http://en.wikipedia.org/wiki/Double-checked_locking#cite_note-IBM-4 and here is what the link has.

The following code is what we could also consider for my first post ( it's taken from the Wikipedia link ).




"Intuitively, this algorithm seems like an efficient solution to the problem. However, this technique has many subtle problems and should usually be avoided. For example, consider the following sequence of events:

1. Thread A notices that the value is not initialized, so it obtains the lock and begins to initialize the value.

2. Due to the semantics of some programming languages, the code generated by the compiler is allowed to update the shared variable to point to a partially constructed object before A has finished performing the initialization. For example, in Java if a call to a constructor has been inlined then the shared variable may immediately be updated once the storage has been allocated but before the inlined constructor initializes the object.[4]

3. Thread B notices that the shared variable has been initialized (or so it appears), and returns its value. Because thread B believes the value is already initialized, it does not acquire the lock. If B uses the object before all of the initialization done by A is seen by B (either because A has not finished initializing it or because some of the initialized values in the object have not yet percolated to the memory B uses (cache coherence)), the program will likely crash."

So it is Case 3 that the Wikipedia article describes. Could somebody please explain what is meant by the following part?

"in Java if a call to a constructor has been inlined then the shared variable may immediately be updated once the storage has been allocated but before the inlined constructor initializes the object."

Could someone please help me with it.

Thanks,
Chan.

Tony Docherty
Bartender

Joined: Aug 07, 2007
Posts: 2173
    
  47
"in Java if a call to a constructor has been inlined then the shared variable may immediately be updated once the storage has been allocated but before the inlined constructor initializes the object."

Could someone please help me with it.

It is talking about compiler optimisations. Inlining of code is when the compiler replaces a method call with the body of the code. This is done to improve performance, often at the expense of program size.

I don't know enough about the inner workings of the object construction process to say specifically why this can cause the given problem, but it is possible to envisage a scenario where, rather than storing the reference to a newly created but uninitialized object in a JVM local variable which is only assigned to the object variable on constructor completion, the in-lined code uses the object variable to store the reference as soon as it is created. Hence it can have a reference to a created but uninitialized or partly initialized object.
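
The scenario described above can be imitated in a single thread. This is only a sketch under stated assumptions: real Java source cannot split `new` into separate steps, so a hypothetical init() method stands in for the inlined constructor body, and all class and method names are made up for illustration.

```java
// Single-threaded imitation of "reference stored before the constructor body runs".
// init() stands in for the inlined constructor body; all names are hypothetical.
class Helper {
    int value; // holds the default 0 right after allocation

    void init() { // plays the role of the inlined constructor body
        value = 42;
    }
}

class InlinedConstructionDemo {
    static Helper shared; // plays the role of the non-volatile shared field

    // Returns what a reader would see in the window between publish and init.
    static int observeBetweenPublishAndInit() {
        Helper tmp = new Helper(); // step 1: storage allocated, fields at defaults
        shared = tmp;              // step 2: reference stored in the shared field early
        int seen = shared.value;   // a reader arriving now gets a non-null reference...
        tmp.init();                // step 3: ...before the initialization has run
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("seen before init: " + observeBetweenPublishAndInit());
        System.out.println("seen after init:  " + shared.value);
    }
}
```

The reader in the middle sees a non-null reference whose field still holds its default value, which is the "created but uninitialized" state described above.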
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7551
    
  18

Chan Ag wrote:Could someone please help me with it.

I'm afraid I'd have to go back to notes I made ages ago on this subject to explain it fully, because I have to do it every time I start thinking about this stuff in any depth.

But one thing to remember is that in an unsynchronized (or non-volatile) situation, the compiler is allowed to re-order instructions (and, I believe, even rationalize them) if it wants to, provided there's no logical difference in the outcome as seen by the thread executing them - and some modern optimizing compilers are VERY good at it.

Basically, synchronization and volatility inhibit that process in certain ways, ensuring that, in terms of visibility, statements are ordered exactly the way you wrote them.

If you want more information, you might try Googling "happens-before".
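
As a small illustration of a happens-before edge, here is a sketch in which a volatile flag publishes an ordinary field. The class and field names are hypothetical, and the example assumes the Java 5 or later memory model.

```java
// Sketch: a volatile write/read pair creates a happens-before edge.
// Class and field names are hypothetical.
class HappensBeforeDemo {
    static int data;               // plain, non-volatile field
    static volatile boolean ready; // volatile flag used to publish data

    static int runOnce() {
        data = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read: wait until the flag is set
                Thread.yield();
            }
            seen[0] = data;        // the write to data happens-before this read
        });
        Thread writer = new Thread(() -> {
            data = 42;             // ordinary write...
            ready = true;          // ...published by the volatile write after it
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

If `ready` were not volatile, the reader would have no guarantee of ever seeing the flag change, or of seeing 42 in `data` once it did.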

Winston

Chan Ag
Thanks, Tony and Winston. Your responses, coupled with Jeff's and Mike's ( in the other thread ) and the topic content in the book, give me a fair amount of information to develop an initial understanding of it.

Inlining of code is when the compiler replaces a method call with the body of the code. This is done to improve performance, often at the expense of program size.


it is possible to envisage a scenario where, rather than storing the reference to a newly created but uninitialized object in a JVM local variable which is only assigned to the object variable on constructor completion, the in-lined code uses the object variable to store the reference as soon as it is created. Hence it can have a reference to a created but uninitialized or partly initialized object.


Thanks. This must be what they are referring to. For final objects, init blocks might return the reference even before the construction is complete; that is one of the simple things that came to my mind.

Basically, synchronization and volatility inhibit that process in certain ways, ensuring that, in terms of visibility, statements are ordered exactly the way you wrote them.

If you want more information, you might try Googling "happens-before".


I had recently taken a look at the JLS to understand all the directly, explicitly stated cases that create a happens-before relationship ( Maxim had suggested I do the same ). It was quite helpful. But I guess I need to read it again ( that was hardly two days back ). I think I will also make notes on this subject while studying. God it's so <not explicit> in some cases. Thanks.

Thanks,
Chan.

Maxim Karvonen
Ranch Hand

Joined: Jun 14, 2013
Posts: 101
    
  10
Somewhere in the discussion, the authors also imply ( I'm not sure if my interpretation of what the authors mean is right here. Hence I say imply ) that it's possible that the second thread gets the reference of the object before the first thread is done constructing the object. The second thread can then call other methods that can be called on the constructed object ( obviously methods cannot be called if object was still null. Hence the authors must mean the reference the object would get once it is constructed is leaked before the construction is complete. Or am I missing something here? ) although the first thread is not yet done constructing the object. I could not understand how this is possible.


You are correct. The reference may leak out, and this has a practical explanation.

Consider some multi-processor machine. Each processor has its own cache. There is no cache coherency protocol (for example, for efficiency reasons in a distributed NUMA setup). There may be no write-through for write operations (for the same efficiency reasons).

Let's assume your code snippet without a volatile.

1. The first thread executes the getHelper method.
2. The first thread enters the synchronized block.
3. The first thread creates a new instance of Helper and writes it into the helper variable, but does not leave the synchronized block yet (the JVM has not requested a cache flush yet).
4. The second thread enters the getHelper method.
5. The second thread reads the helper field from main memory. It can see it as non-null because it may have been flushed by the CPU executing the first thread (of course, it may not have been flushed, but that is not the interesting case).
6. The second thread attempts to use some fields in Helper. But these fields may be uninitialized! Their values may not have been flushed yet from the first thread's CPU. Or maybe the second thread's CPU got these values from its local cache (the cache is JVM-unaware, and there was some "empty" object at this address sometime before).
7. The first thread flushes its caches to main memory.

It is an imaginary machine, of course. Why such a delay between 3 and 7? Maybe an OS process kicked in. Or the CPU overheated and was idle for some time. Why was only the helper reference flushed to main memory? Maybe it was the OS once again (and the cache line used by the helper field was needed for something else).
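
The steps above can be replayed on a toy model of this imaginary machine. This is only a sketch: it is not how a JVM or real hardware works, every name in it is hypothetical, and addresses are modelled as strings with 0 standing for null or a default value.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the imaginary machine: per-CPU write caches and no coherency.
// NOT how a JVM is implemented; all names are hypothetical.
class ToyMachine {
    final Map<String, Integer> mainMemory = new HashMap<>();

    class Cpu {
        private final Map<String, Integer> cache = new HashMap<>();

        void write(String address, int value) { // writes stay in the local cache
            cache.put(address, value);
        }

        void flush(String address) {            // a single cache line reaches main memory
            Integer v = cache.remove(address);
            if (v != null) {
                mainMemory.put(address, v);
            }
        }

        void flushAll() {                       // e.g. on leaving a synchronized block
            mainMemory.putAll(cache);
            cache.clear();
        }

        int read(String address) {              // local cache first, then main memory
            Integer v = cache.get(address);
            if (v == null) {
                v = mainMemory.get(address);
            }
            return v == null ? 0 : v;           // 0 models null or a default value
        }
    }
}

class ToyMachineDemo {
    // Returns { helper seen by CPU 2, helper.value seen by CPU 2, helper.value after step 7 }.
    static int[] run() {
        ToyMachine machine = new ToyMachine();
        ToyMachine.Cpu first = machine.new Cpu();
        ToyMachine.Cpu second = machine.new Cpu();

        first.write("helper.value", 42); // step 3: constructor writes a field
        first.write("helper", 1);        // step 3: reference stored (1 = non-null)
        first.flush("helper");           // only the reference happens to be flushed

        int ref = second.read("helper");         // step 5: non-null
        int field = second.read("helper.value"); // step 6: still the default value
        first.flushAll();                        // step 7
        int later = second.read("helper.value");
        return new int[] { ref, field, later };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("helper=" + r[0] + " value=" + r[1] + " later=" + r[2]);
    }
}
```

The second CPU sees a non-null helper with a default field value until the first CPU flushes everything, which is the window between steps 3 and 7.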

How does volatile solve that? First, it requires all previous writes to be flushed before the volatile value is written. There may be a separate cache flush before the write, or there may be a single "write-and-flush" instruction to set some variables.
Second, it requires a thread to read such values from main memory and never from a local cache. This may be a timestamp-based mechanism (all cache lines preceding a volatile write should be invalidated or just not used) or just a full cache flush.
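
For comparison, here is a sketch of getHelper with the field declared volatile. It assumes the Java 5 or later memory model; the Foo and Helper names and the `result` local variable are illustrative choices, not code from the book.

```java
// Sketch of double-checked locking with a volatile field (Java 5+ memory model).
// Foo and Helper are illustrative names.
class Helper {
    int value;

    Helper() {
        value = 42;
    }
}

class Foo {
    private volatile Helper helper; // volatile: writes before the store become visible to readers

    Helper getHelper() {
        Helper result = helper;         // single volatile read on the fast path
        if (result == null) {
            synchronized (this) {
                result = helper;        // re-check under the lock
                if (result == null) {
                    result = new Helper();
                    helper = result;    // volatile write publishes the constructed object
                }
            }
        }
        return result;
    }
}
```

A thread that reads a non-null `helper` here is guaranteed to also see the writes made by the constructor, because they happen-before the volatile write.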

I also want to emphasize one point in the JVM specification using a slightly modified "setup". There may be no main memory at all. A CPU may take values from its own cache/memory or request them from other processors. Unless it is specified explicitly, it may arbitrarily choose to use its own values or to request them from another CPU. In this case things become a little more complicated.

There is no main memory, but there are volatile values. What do they do? They perform some kind of timestamp-based synchronization. Each volatile write (and each exit from a synchronized block) registers an event with a timestamp, an object (either the field or the object under synchronization) and the source CPU in a global "visibility controller". Such registrations are performed in sequence. Each volatile read (or entry to a synchronized block) loads the timestamp and source for the corresponding object (the volatile variable or the object used in the synchronized block) from the "visibility controller". Then the reading processor loads all local timestamps from the event source CPU (the CPU that performed the volatile write) and invalidates its own stale values (and maybe replaces them with new values).

Now you can see why it is important to synchronize on the proper object to ensure visibility. If you synchronize on the wrong object, you invalidate the wrong cache lines (and may forget to invalidate some required lines!).

Please be warned that this model is not equivalent to the Java memory model! There may be some correct Java executions which cannot be modelled on this "machine" (some advanced reorderings or complex optimizations, such as those related to the causality requirement). But all executions possible on this machine (excluding external actions) are correct according to the Java memory model. So if we find some misbehaviour on this hypothetical machine, we may be sure that the same misbehaviour may occur in a JVM. But not the other way round: correct execution on the hypothetical machine does not guarantee correct Java behaviour. Why use this model? It can easily demonstrate some common mistakes in understanding the Java memory model. And it is easier (for me, of course) to validate a demo/tutorial program against this model first, and only when that validation is successful involve the full-blown Java memory model.
Chan Ag
Thanks, Maxim. It's a great, great explanation. I don't think I would have found this anywhere else. The time-stamp mechanism explains very well the happens-before relationship that the JLS specifies. None of the links I have browsed so far have had such a detailed and convincing description of the whys I've had. Thank you so much.

Best Regards,
Chan.
Winston Gutkowski

Chan Ag wrote:God it's so <not explicit> in some cases.

Actually, I think it's fairly explicit. Indeed what synchronization does is pretty simple. It's the logical situations that can arise when you don't do it that can get very complicated.

I did database design early on in my career (invaluable, BTW) and I always think of it as a bit like a transaction: Transactions themselves are simple, but the situations that can arise (particularly deadlocks) AREN'T.

Basically, the best thing to assume is that in a multi-threaded environment another thread can request the same thing (or execute the same method) as yours AT ANY TIME - it's probably not quite true, but if you assume that it's so, you're unlikely to go wrong.

So with any logic of the form:
if (some condition is true)
   do something

another thread can intervene between the if and the do.

Which is why I like the Tell, Don't Ask paradigm.
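
One way to read this in Java terms is sketched below. It is not taken from the linked article: the Registry class and its method names are hypothetical, and only the ConcurrentHashMap calls are real library API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: "ask, then act" versus telling the map to do both in one step.
// Registry and its methods are hypothetical names.
class Registry {
    private final ConcurrentMap<String, Object> entries = new ConcurrentHashMap<>();

    // Ask, then act: another thread can run between containsKey and put.
    Object getOrCreateRacy(String key) {
        if (!entries.containsKey(key)) {    // the "if"
            entries.put(key, new Object()); // the "do": may replace another thread's entry
        }
        return entries.get(key);
    }

    // Tell: the map performs the check and the insert as one atomic operation.
    Object getOrCreate(String key) {
        return entries.computeIfAbsent(key, k -> new Object());
    }
}
```

With getOrCreate, every caller asking for the same key gets the same object, however the threads interleave; with getOrCreateRacy, two threads can each create an object and one of them may be handed an instance that is later replaced.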

Winston
Chan Ag
Very nice article, Winston. Thanks for sharing it. I liked it very much.

Personally, this is one area ( along with many others ) that I need to work on the most ( I think so ).

Chan.



Winston Gutkowski

Chan Ag wrote:Very nice article, Winston. Thanks for sharing it. I liked it very much.

You're most welcome.

Winston
 