aspose file tools*
The moose likes Developer Certification (SCJD/OCMJD) and the fly likes Locking Schemes:  Tactical View 01 Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of EJB 3 in Action this week in the EJB and other Java EE Technologies forum!
JavaRanch » Java Forums » Certification » Developer Certification (SCJD/OCMJD)
Bookmark "Locking Schemes:  Tactical View 01" Watch "Locking Schemes:  Tactical View 01" New topic
Author

Locking Schemes: Tactical View 01

Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
The following is a review of locking schemes in general for comparison
purposes. This is a tactical review. Please feel free to correct
any mis-notions I may have.
The following assumptions are made:
* the database is stored as one, random access file,
* the database is accessed through the class Data.
* low level file read, update, write, and create new record operations
are carried out within only one instantiated class called MicroData
where each method is synchronized. There is no overhead processing
associated with any of these methods in that each method directly reads,
updates, writes, or creates a new record in the file.
* although not explicitly drawn, there exists one instantiation
of the singleton class called Guard, which is involved with
logical record locking (has lock() and unlock() methods and is
multi-threaded). Exactly how it is integrated and used
by Data or MyServer or by some other object is not shown.
* client accessible remote objects are coded on the server side to
behave correctly when multi-threaded.
* client accessible remote objects are coded on the server side to
behave correctly when more than one instance of this remote object
is instantiated on the server.
Overview:

We will disregard MyServer as it is just a middleman, and introduce Guard:

Class Guard is used to get access to each individual shared file record of the shared,
single resource: Data (which uses MicroData (which directly manipulates the DbRaf, i.e.,
the database random access file)).
Class Guard is used for logical record locking. As long as all the software obeys
the rules, then only one thread can ever gain access to a record. This allows
concurrency in that ThreadA can lock and use record 1 while ThreadB can lock and
use record 2. The overall time it takes, from the client's perspective, is thus
minimized for all clients equally.

Before continuing, let's consider are strategic decisions: any client accessible
remote abject can be instantiated more than once, and each instantiation can
be multi-threaded.
So, let's look at an example:

In this short code block, record 1 is being locked, processed, and then unlocked.
In our multithreaded environment, that means that two threads could be working
this same three lines of code at the same time, which could result in a programmer
logic error thus:

Whew! We are fine. While it is true that more complicated variations
of these locking and unlocking schemes can accept a unique client ID, this is not
a fundamental part of understanding the concepts, and we will not deal with the
unique client ID at this time.
Before continuing, let's consider a simple mutex class, which I have grossly
simplified from the Java class given at Doug Lea's web site; you would
never use this as production code, I am using it as an algorithmic example
only:

The Guard can be set up to use any of the following collections, as examples
(as this list is not complete):
1. array[]
2. Vector
3. ArrayList
4. HashMap
5. WeakHashMap
And, the collection can be used in one of two ways:

Regardless, we do not want to sequentially search through the collection, and
that is why linked list-like collections are not being considered. But,
random access and keyed lookup collections should offer us sufficient speed.
For some reason, no one ever seems to mention that the collection could be
an array. So, I've added it to the above list as an option. Basically, I think
that an array would be fundamentally similar to a unsynchronized ArrayList
(whose size would not change). We will consider these very similar, and thus
only consider the ArrayList.
When evaluating any given scenario, consider the memory footprint (will all
the file record mutex-like objects always reside in memory, or just those
records that are currently locked), and how often and for how long the
complete collection, if ever, will be completely locked down (i.e., synchronized
on itself for a code block). Finally, consider how easily the representation
can handle the file growing in size.
----------------------------------------------------------
Strategy 1: Collection Size Equals Number of File Records
----------------------------------------------------------
For simplicity, we'll use a random access-like collection for
this example (array[], Vector, ArrayList):

The above is elegant in its simplicity. It is readily readable,
concise, and very clear.
We can also use a look-up collection, such as a HashMap wherein
the code becomes only a little bit more messy:

Strategy 1 algorithms, as given above, are particularly appealing
because they are easy to understand; and, admittedly, I initially
fell in love with them because they are 100% accurate, and you gain
a lot of confidence because they are so easily comprehensible (although
it is true I spent a thread or two trying to understand the details
of multi-threading using the Mutex class in previous threads here).
Nevertheless, I no longer consider Strategy 1 solutions viable
for the following reasons:
1. As the file continues to increase in size:
1a. The memory footprint of the Guard increases.
1b. To grow the Guard object is difficult and could be quite potentially
time consuming, not to mention involving complex coding.
2. I may want to have the option of letting the client directly manipulate
Data as a remote object, in which case a WeakHashMap might be desired
(note: although I haven't though too much about it until just now,
one could explore whether the above Strategy 1 could be used with a
WeakHashMap so that dead, unresponsive clients could be dealt with).
So, while I love their clarity, Strategy 1 solutions are more what
I consider to be learning tools; that is, I've never implemented,
coded, and tested a Strategy 1 solution, but I understand it completely
because it is so intuitively straightforward.
Once you see how simple it is, then it is easier to go to the next,
more realistic solution phase: Strategy 2 solutions.
-----------------
End of Strategy 1
-----------------
------------------------------------------------------------------------------
Strategy 2: Current Collection Size Equals Number of Currently Locked Records
------------------------------------------------------------------------------
There are two sub-categories:
2a. where locking is against a collection and
2b. locking is against individual locks held within the collection.
The difference is in the use of notify() and notifyAll(). The memory footprints
are essentially the same. You can decide whether one algorithm may have more
contention over shared resources than another.
---------------------------------------------------------------------------------------------
Strategy 2a: Current Collection Size Equals Number of Currently Locked Records: notifyAll()
---------------------------------------------------------------------------------------------

What kind of algorithm is this I have given above? Well, basically I realize
that we no longer have a one to one relationship between each record in the
file and a simple, elementary mutex. Instead, we have, what I have named
an ElasticMutex: an object whose size changes dynamically, and this is fundamentally
different from Strategy 1 that was given above. So, it makes sense to consider
the ElasticMutex in isolation, particularly as this is my first study on this
issue (we can always condense the algorithm later).
For the ElasticMutex, it makes little sense to use a random access-like
collection because we will not be using an integer index; instead, we will be keying
off the existance of objects, so a hash table is about the only reasonable
choice. Since the very existence of a (key, value) pair means that a particular
record is locked, value will simply be set to the empty string, everyValue (we
can later, consider extending this to so that (key, value) == (client ID, record number)).
To create the ElasticMutex, and to keep the algorithm as clear as possible,
because production code will be adding more complexity, you will see,
if you jot down the Mutex example code given above, that I have formulated
ElasticMutex so that its structure is exactly the same as the
Mutex class: that is, where the mutex class operated on the simple boolean
"held", I substitute an object, HMutex, with a comparable operation:
1. while(held) becomes while(hMutex.isHeld(recordNumber))
2. held = true becomes hMutex.takeHold(recordNumber)
3. held = false becomes hMutex.letGo(recordNumber)


---------------------------------------------------------------------------------------------
Strategy 2b: Current Collection Size Equals Number of Currently Locked Records: notify()
---------------------------------------------------------------------------------------------
Phil posted this idea just recently:
http://www.coderanch.com/t/185100/java-developer-SCJD/certification/Locking-Notification-issues
While I work through it now, I am not looking at
Phil's posting; so, I may make an error or come up with a less efficient first draft
than that presented by Phil.
This algorithm is a refinement of the algorithm just given above. The main difference
is this:
1. In 2a, all locks were stored in one object called NLock which, let's say, is a HashMap.
Thus, to be certain that a waiting thread was able to leave the wait state, the unlock()
method had to call the notifyAll() method (otherwise, if you called notify() only, you
may awaken a thread who wants currently locked record M, while you have just unlocked
record Q).
2. In 2b, all locks are still stored within one object call NLock which, let's say,
is a HashMap; however, each value of the hash table represents a linked list of
lock objects all relating to one, unique record in the file.
This algorithm 2b is particular interesting in that the while clause and wait() call
looks like this symbolically:

That is, the "syncObjectInLinkedList" refers to a particular element in the linked list.
But which particular element it refers is not specified in the middle three lines
of code. Thus, what "syncObjectInLinkedList" means depends on which thread you are!
I thought that was an interesting concept.
We now have N syncObjectInLinkedList waiting to use a locked record. Each of these
syncObjectInLinkedList has one thread which is waiting on it.
The unlocking code takes the first syncObjectInLinkedList from the linked list,
and calls notify():

In otherwords, in the unlock() method, the thread exiting dictates exactly which
syncObjectInLinkedList it will choose, synchronizes to it, and calls notify, so
that the one thread waiting on this particular syncObjectInLinkedList, can then
proceed to lock the record.
Phil liked this algorithm for at least two reasons:
1. He uses notify() instead of notifyAll(),
2. The first thread requesting the lock, will be assured of being the first
thread to obtain the lock.
In a sense, this algorithm combines two features from algorithms 1 and 2a:
1. From algorithm 1, we use notify().
2. From algorithm 2a, our storage only accounts for locked records, and thus
does not keep an ArrayList for every physical record simply existing within the
file.
Now at this stage I will depart from Phil's presentation (if I haven't already
done so). For I see a line of inquiry that I want to follow.
We note that Phil's linked list is really a mutex, but a special kind, a kind
where the programmer dictates specifically which one thread within the waiting
threads on the mutex will be awakened.
I will simplify the algorithm by saying that we can have a Mutex, or a
LinkedListMutex; and, we will start with a Mutex only, and later, since
Mutex is a separate class, we can change its implementation later if we
want to make it into a LinkedListMutex.
The goal is this
1. To have an increasing and decreasing collection of mutex's representing
the number of locked records in the file (but not to have a collection
of mutex's representing every records which currently exists within the file).
2. To be able to use notify() instead of notifyAll().
We will delay the goal of dictating exactly which thread will be awakened
to take control of any given Mutex.
As long as our HashMap holds a Mutex for a specific record, this should
be possible.
We start with the dumbed-down, algorithmic version of Mutex with its
use of notify() which is a component we desire.

Guard will use an ElasticMutex, and ElasticMutex
will be responsible for storing one Mutex object
for every record. ElasticMutex will not store
Mutex objects when the record is not locked. We
obtain the unique Mutex we want simply by specifying
an argument for recordNumber.


So, assuming the above algorithm is logically correct,
we now have accomplished three goals thanks to Phil's
posting:
1. We can use notify() instead of notifyAll().
2. We can store only those Mutex objects which relate to
locked records, instead of storing a Mutex for every physical
record in the file.
3. And, though I'd have to check Phil's posting again, I think we
have reduced the contention for the HashMap, in that in the
above algorithm, the HashMap is only locked down to either do
a put or a delete.
In short, if the above assertions turn out to be true, this most
likely will be the algorithm I can be happy with. And, again,
I thank Phil for posting and motivating my thinking along these
lines.
However, I've noticed that there may be problems in that while
one thread is using the mutex but hasn't yet locked it, another
thread can be removing the mutex from the HashMap. So, the
above algorithm still needs some thinking. More than likely,
I'll end up having to synchronize on the HashMap more than
I had originally thought. And, the above algorithm, as of now,
is an idea, the algorithm is not yet completed due to this
problem.
New Notes: if it turns out that the only solution is to lock
down the HashMap more often, then I lose some excitement
over this algorithm, for then a real contention problem can
start to arise. What I might do in this case, is just let the
HashMap grow until, say 1,000, 10,000 or however many
records, then periodically a background thread will start
up, lock the HashMap once, go through it, and delete
all Mutex's that have no waiting threads. So, the outline
would be:
1. Lock the database, i.e., make sure no threads are allowed
to attempt to lock any records.
2. Lock the HashMap and remove Mutex's which have no
waiting threads, which is the same as deleting all the records
in it, actually.
-----------------
End of Strategy 2
-----------------
Keep in mind that the above "coding" examples are algorithms, not
production-ready code by any means (again, never before implemented,
coded, and tested). What I have shown, albeit quite accidentally
during my studies, is how we can come up with a set of
classes--Guard, ElasticMutex, and HMutex--based upon the
simple, straightforward Mutex class written by Doug Lea. Thus,
our construction remains about as crystal clear as it can be.
Then, due to Phil's posting, I expanded on my interests regarding
Strategy 2b which looks like it might be my mutex solution of choice.
Thanks,
Javini Javono
[ February 09, 2004: Message edited by: Javini Javono ]
[ February 09, 2004: Message edited by: Javini Javono ]
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
I added a new Strategy 2b (near the end of the above posting)
which is a variant based upon Phil's recent posting:
http://www.coderanch.com/t/185100/java-developer-SCJD/certification/Locking-Notification-issues
Thanks,
Javini Javono
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Locking Schemes: Tactical View 01
 
Similar Threads
lock/unlock
Threads 002
Implementing Lock/Unlock
Synchronization of public methods in DVDDatabase
Lock question!