Threads 002

Javini Javono
Hi,
Could you please verify this assertion:
A multiple-threaded object instance, ObjectName, has a static method
which does non-trivial transformations on other class
(static) variables (including types long, float, and double):
therefore, assuming the Java keyword "volatile" is not used, either
1. The static, multi-threaded method must be synchronized, or
2. At critical points within the static multi-threaded method,
it must synchronize a code block on ObjectName.class, or
3. The invocation of the static method must be synchronized
on ObjectName.class like this:
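A minimal sketch of option 3, assuming a hypothetical static method transform() and two illustrative static fields (none of these names come from the assignment). The static method itself is left unsynchronized and every caller wraps the invocation in a block synchronized on ObjectName.class:

```java
// Hypothetical class: the fields and transform() are illustrative assumptions.
class ObjectName {
    private static long total = 0L;      // shared class (static) state
    private static double scaled = 0.0;

    // Deliberately not synchronized: callers are expected to hold the
    // lock on ObjectName.class while invoking it.
    static void transform(long amount) {
        total += amount;                 // read-modify-write, not atomic by itself
        scaled = total * 1.5;
    }

    static long total() {
        synchronized (ObjectName.class) {
            return total;
        }
    }
}

class StaticSyncDemo {
    // Starts `threads` threads that each call transform(1) `perThread`
    // times, and returns how much the shared total grew.
    static long run(int threads, int perThread) {
        long before = ObjectName.total();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    synchronized (ObjectName.class) {   // option 3
                        ObjectName.transform(1);
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ObjectName.total() - before;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```

Option 3 only works if every caller remembers the synchronized block, which is why declaring the method itself synchronized (option 1) is usually the less fragile choice.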



Since writing the above, I've verified it to be true through text books. The
one thing I did not know, however, was that the volatile keyword can and should
also be applied to all primitive types: boolean, short, int, and float, as well
as the 64-bit types long and double.
What I have not done yet is a test. Does a simple synchronization of a method
"cost" in speed from 5 to 10 percent? Probably. Then, how much of a cost in
speed is the use of volatile, if anything significant? I might eventually do
a simple speed test on this.

Thanks,
Javini Javono
Hi (again),
This posting discusses every aspect of thread safety
that I can think of at this time that relates to this
project. I'm reviewing this topic as I am about to
go into the implementation and re-implementation stages
for my RMI-based server-side software. I'll be using
this post as a condensed resource, assuming that the information
is accurate, since I find that sometimes these threading issues
slip my mind. Of course, I appreciate any comments,
corrections, and expansions.
Assumptions:
Some assumptions change in different sections of this
posting.
The following assumptions are constant throughout:
1. There is one database random access file.
2. Only RMI connections are used in these examples.
3. The client has access to a remote Data object which,
for simplicity, is simply called Data (instead of DataImpl).
Data does not directly manipulate the file, but instead
calls MicroData which carries out very low-level file
manipulations. Every method of MicroData is synchronized.
4. There is a separate lock manager object called Guard which
has three methods: lock(recordNumber), unlock(recordNumber),
and isLocked(recordNumber).
5. The Client defines business methods, such as the following:
guard.lock(100);
fileStuff = data.read(100);
fileStuff = someProcess(fileStuff);
data.write(fileStuff, 100);
guard.unlock(100);

The invocation and usage structure looks something like this:

Both Data and Guard are remote objects, and the client can invoke
methods on both objects.
Part 1: General Considerations
-------------------------------
Let's first discuss what it means to be "threadsafe." I suspect
that this has different interpretations in different environments.
This stuff I always find tricky, so don't be surprised if I make
an erroneous statement here and there.
For this theoretical discussion, we will be writing a Data class
which uses the following class, Guard:

For this example, there will be one instance of Data and it will
be multi-threaded.
The following simple rules can be applied to make Data threadsafe:
1. Do not declare any static class variables.
2. Do not declare any instance variables.
3. Only use method variables.
These are simple guidelines, and there are more advanced rules not
mentioned above.
Following the above simple rules, we generate the following threadsafe
code for Data:
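What follows is a sketch reconstructed from the numbered points in this post, not the original listing. The method names incrementInput(), incrementUnsyncInput(), and process(), and the members guardInstance, inputValue, value, and mutableObject are the ones the points refer to; everything else (the method bodies, the List argument, the increment() instance method) is an assumption made for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class with one synchronized and one unsynchronized
// static method, plus an unsynchronized instance method.
class Guard {
    private static long counter = 0L;    // shared class state
    private long instanceCount = 0L;     // per-instance state

    // Safe: the class lock serializes all callers.
    static synchronized long incrementInput(long input) {
        counter++;
        return input + 1;
    }

    // Not safe on its own: callers must synchronize on Guard.class.
    static long incrementUnsyncInput(long input) {
        counter++;
        return input + 1;
    }

    // Safe only while one thread owns the instance, or while the caller
    // synchronizes on the instance.
    long increment(long input) {
        instanceCount++;
        return input + 1;
    }
}

class Data {
    private final String name = "Data";               // final and immutable
    private final Guard guardInstance = new Guard();  // shared by all threads

    long process(long inputValue, List<Long> mutableObject) {
        long value = inputValue;                      // local: one copy per thread

        Guard localGuard = new Guard();               // used by this thread only
        value = localGuard.increment(value);

        value = Guard.incrementInput(value);          // synchronized static method

        synchronized (Guard.class) {                  // unsynchronized static method
            value = Guard.incrementUnsyncInput(value);
        }

        synchronized (guardInstance) {                // shared instance member
            value = guardInstance.increment(value);
        }

        synchronized (mutableObject) {                // argument that callers may share
            mutableObject.add(value);
        }
        return value;
    }
}
```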

Data is a threadsafe class due to the following:
1. Each thread gets its own unique inputValue primitive.
2. Each thread gets its own copy of value from the stack.
3. Any potentially shared variables that are class or instance
variables are final and unchangeable, or are immutable and
unchangeable. Here I am assuming that a well designed and coded
immutable object is threadsafe, and I am assuming that the String
class is threadsafe, even though its JavaDoc does not touch upon
this topic.
4. The return value in the declaration, the "long", belongs uniquely
to each thread.
5. The code does not attempt to share one instance of object Guard,
but instead uses a new instance, and this new instance of Guard
is used exclusively by one thread.
6. The invocation of the static method Guard.incrementInput()
would not be thread safe if this class method were not declared
synchronized. For there only exists one such static method, and,
obviously, if process() is multi-threaded, then multiple threads
could simultaneously be passing through this method.
7. The invocation of the static method Guard.incrementUnsyncInput()
is unsynchronized, and thus not threadsafe. So, within the process()
method, the Guard class is synchronized on and then the invocation
of the unthreadsafe method occurs.
8. Because Data is multithreaded, then its instance member, guardInstance,
is shared by all the threads using Data. To use this shared, instance object
in a multithreaded environment, it must be synchronized, and that is
what occurs in the above code.
9. mutableObject is passed in as an input argument to the method; it is,
of course, a reference to one object. If the invoking method was calling
our method like this:

then we would be assured that each thread passing through the process()
method would have its own, unique mutableObject and operating on it
would be threadsafe. However, for this example, we are assuming
that the invoking method looks more like this:

Thus, mutableObject has only one instance, and so in the method
process(), which receives mutableObject as a reference, that one
instance of mutableObject must be synchronized before mutating it.
Now of course, if we pretend that this mutable object was a Vector
which is coded to be threadsafe, then we would have no need to
synchronize it since it has already been designed to safely handle
multiple threads (this is usually done by simply synchronizing all
the methods of the class).
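The two invocation styles from point 9 can be sketched as follows. MiniData is a hypothetical stand-in for Data whose process() simply appends to a list; the demo method names are also assumptions. In the first style each thread builds its own object, in the second a single object is handed to every thread:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Data: process() mutates its argument.
class MiniData {
    void process(List<Integer> mutableObject, int value) {
        synchronized (mutableObject) {   // needed only when callers share the object
            mutableObject.add(value);
        }
    }
}

class InvocationDemo {
    // First style: each thread creates its own object, so nothing is shared.
    static int ownObjectPerThread(int threads) {
        final MiniData data = new MiniData();
        final int[] sizes = new int[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                List<Integer> own = new ArrayList<>();
                data.process(own, id);
                sizes[id] = own.size();
            });
        }
        runAll(workers);
        int total = 0;
        for (int size : sizes) {
            total += size;
        }
        return total;
    }

    // Second style: one object is created once and handed to every thread.
    static int oneSharedObject(int threads) {
        final MiniData data = new MiniData();
        final List<Integer> shared = new ArrayList<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> data.process(shared, id));
        }
        runAll(workers);
        return shared.size();
    }

    private static void runAll(Thread[] workers) {
        for (Thread worker : workers) {
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

In the first style the synchronized block is harmless but unnecessary; in the second, removing it would let concurrent add() calls corrupt the ArrayList.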
The code is threadsafe, in short, because it is coded so as not to use
anything which might be shared between two or more threads, but when it
does go beyond these elementary rules and does use shared objects, it
ensures that their use is safe through the use of synchronization.
If you reasonably refactor your design so that your multi-threaded classes
follow the simple rules given above, then you minimize or have no need
whatsoever for the synchronized keyword in your code; this makes the code
easier to understand, and lessens the probability that you will accidentally
dead-lock yourself, or that someone coming along after you to maintain the
code will make a seemingly harmless change, here and there, only to find that
the code dead-locks.
Part 2: Combinations
---------------------

We will assume always that on the server side there is only one
instance of Guard and that it is multi-threaded.
We will consider the following assumptions about Data in turn:

Case 1
------
Client(1) --> single-threaded Data(1)
Designing and coding Data is the simplest because it is not multi-threaded.
Even when you have two clients, each with its own Data object, Data is still
single-threaded and still easier to design and code:
Client#1 --> single-threaded Data instance number 1
Client#2 --> single-threaded Data instance number 2
Case 2
------
Client(1) --> multi-threaded Data(1)
This is unusual, but would occur if you allowed the clients to create
multiple threads and send them through Data. An example might be a
shopping cart containing 10 hotel reservations; the client then
attempts to book 10 reservations simultaneously by sending in 10
threads to use Data at once. Designing and coding Data
is harder since it must be threadsafe.
Case 3
------
Client(N) --> multi-threaded Data(1)
Designing and coding Data is harder since it must be threadsafe.
Case 4
------
Client(N) --> multi-threaded Data(M)
This is perhaps clearer if we restate it like this:

we see that it is not that remarkable in nature. For instance, what
we are doing on the server side in our factory is this: we are in
advance, saying that we will define not more than 3 Data instances;
and as we accept new clients, we share these three instances among the
N clients; thus, each instance of data is potentially multi-threaded.
We allow there to be as many Data instances as might be required depending
on the load of the system and how much memory the server has.
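A rough sketch of such a factory, assuming a fixed pool whose size is chosen at start-up and a simple round-robin hand-out. DataFactory, dataForNewClient(), and the id field are hypothetical names; a real factory would return remote stubs and might size the pool dynamically:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the remote Data object.
class Data {
    private final int id;

    Data(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }
}

// Shares a fixed number of Data instances among any number of clients.
class DataFactory {
    private final Data[] pool;
    private final AtomicInteger next = new AtomicInteger();

    DataFactory(int instances) {
        pool = new Data[instances];
        for (int i = 0; i < instances; i++) {
            pool[i] = new Data(i);
        }
    }

    // Hands out the pooled instances in round-robin order, so each
    // instance may end up serving several clients (and threads) at once.
    Data dataForNewClient() {
        int index = Math.floorMod(next.getAndIncrement(), pool.length);
        return pool[index];
    }
}
```

With a pool of 3, the fourth client receives the same Data instance as the first, which is exactly why each instance has to be threadsafe.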
Case 4 is simply not required for the exam of course. But, it brings
up an interesting question: is RMI a toy? Or is it a real, potential
server-based process? One way to achieve scale would be to have a
factory decide how many Data instances are needed and how many times
each instance should be multi-threaded dynamically, just like a servlet
container functions.
Part 3: Forced Shared Resource: Guard
---------------------------------------
Regardless of how you design your server-side system, that is, whether
or not Data is single-threaded or multi-threaded, you are forced, by
the requirements of the project, to share the lock manager, which I will
call class Guard.
Now, you could synchronize every method of Data, and then it could be
multi-threaded and also be threadsafe as long as only one instance of Data
ever existed on the server; but, this loses concurrency, and is not considered
acceptable.
Therefore, you are forced to implement a solution using the Java synchronized
keyword, using wait(), notify(), notifyAll() (in some subset or combination),
and to design a locking mechanism of one type or another.
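One such mechanism can be sketched like this. It is only one possible design, assuming integer record numbers and a single monitor for the whole Guard; it does not track which client owns a lock, which a real solution would probably need:

```java
import java.util.HashSet;
import java.util.Set;

// Minimal lock manager: the set holds the record numbers currently locked.
class Guard {
    private final Set<Integer> locked = new HashSet<>();

    // Blocks until no other caller holds the record, then claims it.
    synchronized void lock(int recordNumber) throws InterruptedException {
        while (locked.contains(recordNumber)) {
            wait();                      // releases the monitor while waiting
        }
        locked.add(recordNumber);
    }

    // Releases the record and wakes every waiter so each can re-check.
    synchronized void unlock(int recordNumber) {
        if (locked.remove(recordNumber)) {
            notifyAll();
        }
    }

    synchronized boolean isLocked(int recordNumber) {
        return locked.contains(recordNumber);
    }
}

class GuardDemo {
    // Returns true if a second thread had to wait for the first unlock.
    static boolean secondLockerWaits() {
        final Guard guard = new Guard();
        try {
            guard.lock(100);
            Thread other = new Thread(() -> {
                try {
                    guard.lock(100);     // blocks until the first holder unlocks
                    guard.unlock(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            other.start();
            other.join(200);             // the other thread cannot finish yet
            boolean stillWaiting = other.isAlive();
            guard.unlock(100);           // wakes the waiting thread
            other.join();
            return stillWaiting && !guard.isLocked(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The while loop around wait() matters: notifyAll() wakes threads waiting on any record, so each one must re-check its own record before proceeding.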
Part 4: Business Methods
-------------------------
Let it be assumed that Client contains business methods. If the business
method is a compound operation relying on the fact that a previous read
has not changed before an update or write is made to the same record, then
it is obvious that the following construct within the business method
is required:
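This is the same lock, read, process, write, unlock sequence listed under assumption 5, sketched here with a try/finally so that the record is unlocked even if the read or write fails. The interfaces are simplified assumptions (real RMI versions would also declare java.rmi.RemoteException), and book(), someProcess(), and the in-memory stubs are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified interfaces for the two remote objects.
interface Guard {
    void lock(int recordNumber);
    void unlock(int recordNumber);
}

interface Data {
    String read(int recordNumber);
    void write(String record, int recordNumber);
}

class Client {
    private final Guard guard;
    private final Data data;

    Client(Guard guard, Data data) {
        this.guard = guard;
        this.data = data;
    }

    // Compound operation: the value read must still be current when the
    // write happens, so the whole sequence runs under the record lock.
    void book(int recordNumber, String customer) {
        guard.lock(recordNumber);
        try {
            String fileStuff = data.read(recordNumber);
            fileStuff = someProcess(fileStuff, customer);
            data.write(fileStuff, recordNumber);
        } finally {
            guard.unlock(recordNumber);  // released even if read or write throws
        }
    }

    private static String someProcess(String record, String customer) {
        return record + "|" + customer;
    }
}

class BookingDemo {
    // Runs one booking against in-memory stubs and returns the call
    // order followed by the stored record.
    static String run() {
        final StringBuilder log = new StringBuilder();
        final Map<Integer, String> records = new HashMap<>();
        records.put(100, "room 100");

        Guard guard = new Guard() {
            public void lock(int n) { log.append("lock "); }
            public void unlock(int n) { log.append("unlock "); }
        };
        Data data = new Data() {
            public String read(int n) { log.append("read "); return records.get(n); }
            public void write(String r, int n) { log.append("write "); records.put(n, r); }
        };

        new Client(guard, data).book(100, "alice");
        return log + "-> " + records.get(100);
    }
}
```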

However, what about a business method which simply reads in a given record?
There are two choices: we read in the record and don't use the locking and
unlocking mechanisms, or we use the locking mechanism before reading the
record.
For this example, let's assume another business method which deletes
a record from the database:

Now, at the same instant:
Some business process decides that record 100 should be deleted.
Client 1 decides to read record 100 to see what is in it.
Let's discover the ramifications concerning Client 1 when the Guard
is used and when the Guard is not used.
Let's begin assuming that the Guard is used. Then the reading
business method would look something like this:
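A minimal sketch of the guarded read, shown next to the competing delete. The Guard here is a bare-bones assumption (one monitor, simplified interrupt handling), and a Map stands in for Data; read() and delete() are hypothetical business methods:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Bare-bones lock manager used only to make the sketch self-contained.
class Guard {
    private final Set<Integer> locked = new HashSet<>();

    synchronized void lock(int recordNumber) {
        boolean interrupted = false;
        while (locked.contains(recordNumber)) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;      // keep waiting, remember the interrupt
            }
        }
        locked.add(recordNumber);
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized void unlock(int recordNumber) {
        locked.remove(recordNumber);
        notifyAll();
    }
}

class Client {
    private final Guard guard;
    private final Map<Integer, String> records;   // stand-in for Data

    Client(Guard guard, Map<Integer, String> records) {
        this.guard = guard;
        this.records = records;
    }

    // Guarded read: returns null when the record no longer exists.
    String read(int recordNumber) {
        guard.lock(recordNumber);
        try {
            return records.get(recordNumber);
        } finally {
            guard.unlock(recordNumber);
        }
    }

    // Guarded delete.
    void delete(int recordNumber) {
        guard.lock(recordNumber);
        try {
            records.remove(recordNumber);
        } finally {
            guard.unlock(recordNumber);
        }
    }
}
```

Whichever method obtains the lock first decides what the reader sees: a read that runs before delete(100) returns the record, and a read that runs after it returns null.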

Now, what exactly have we gained by using the Guard for this read?
The record 100 either exists or it doesn't exist. Whether record 100
exists is like atomic theory (sort of): it's uncertain. Even if
our business read method uses Guard, record 100's existence cannot
be determined. Using Guard doesn't guarantee the "correct" answer,
for if the business read method gets the lock first, then the
correct answer is that record 100 exists. But if the delete business
method gets the lock first, then the equally correct answer is that
record 100 no longer exists.
Locking the record 100 does not assist us in any way. Therefore,
I think we can safely state the following compound hypothesis:
1. If any record in the database is never left by any sub-step
of a business process in an inconsistent state, then
2. It is perfectly acceptable for the business read method not
to attempt to lock the record it is reading.
Now, the above is my opinion, I should add: that is, the qualification
at point 1 is something I personally consider important (and I believe
that other people might consider it not important).
Now the question becomes, do there exist any business methods which in
any way mutate a record such that during any sub-steps that record is
in an inconsistent state?
Basically, this becomes a contract which must be enforced through the
code. Thus, every method in Data must carry out enough steps such
that when this method is exited, the record is never in an inconsistent
state. For example, it is conceivable that a method in Data might,
under some hypothetical circumstance, need to be synchronized to ensure
that sub-steps within the method never leave the record in an inconsistent
state. If this can be enforced, we are free to not have our reads
use the Guard object.
Part 5: Reads Free of Guard Use and Implications on Guard's Dynamic Structure
------------------------------------------------------------------------------
Once we decide that business reads of records do not require Guard's lock()
and unlock(), then this means that guard must at the minimum have one mutex
for any record which is being mutated in some way by a business operation.
Here are some concerns about the Guard class:
1. How much memory is available for its use; how large will the Guard object become.
2. How much synchronization is required for the Guard class to function. The more
synchronization, then the more contention, that is, the more time processing
power is spent locking and unlocking monitors for numerous contending threads.
However, once we decide that business reads will not use the Guard class,
then that means that far fewer threads and far fewer records will be using
the Guard class. This suggests that we do not need to be overly concerned
about contention. However, if there is a busy day at the office, and
many bookings are being made, and since each booking is a record mutation
operation, then each record booked is, at some point, used within the Guard
class, which can, under adverse circumstances, eventually consume too much
memory.
[Aside: I should remind you, that for this project, these are not real issues.
But, the issues are interesting, so that is why I am studying them.]
In short, the problem is this: once we book a record, then it is highly
improbable (but not necessarily impossible), that the record will be mutated
again. Thus, how do we remove this record from the Guard? The problem exists
because when you use an unsynchronized Guard design, and that design is being
multi-threaded, you can't safely grow or shrink the collection, whether this
collection be an ArrayList or a HashMap or a WeakHashMap.
If you look at some of the algorithms, you may find, though I have not carried
out this study myself, that using a synchronized collection does not introduce
any more or any more significant contention than an unsynchronized collection.
In this scenario, where a synchronized collection is used, it would be safe
to shrink the Guard's size either instantly or on a periodic basis perhaps
performed by a background thread.
Another approach is to use Phil's algorithm which consists of the following
constructs for the Guard object (and, I summarize, so please see Phil's
article):
1. The Guard object uses a HashMap.
2. Each item in the hash table is a MUTEX.
3. The particular form of the MUTEX is what I call a "linked-list mutex"
wherein the first thread in, is the first thread to gain access.
4. And, Phil may not really be using a MUTEX (mutually exclusive lock);
while I don't know, it's possible he uses a type of lock which might
be termed "mutually excluded writes" and "unexcluded reads".
Phil's algorithm--so Phil asserts and I believe, though I personally have
not worked it out myself--allows the unsynchronized HashMap to dynamically
grow and shrink in size. [Aside: again, an interesting question is to study
whether a synchronized HashMap would add any significant contention issues.]
Notice, by the way, that if we sent every business request through the Guard,
including simple business reads, the Guard, regardless of how it would be
implemented, would always hold mutexes for each record in the file, since we
can assume that given enough users, the complete file is always being read
for searches. So, this whole discussion has but little relevance unless
one intends to use business reads not requiring the use of the Guard class.
Part 6: Multi-Threaded Data and Guard
--------------------------------------
For every database file, and we only have one for our project, there must
exist only one, unique Guard.
So, if each Data is single-threaded, and there certainly will be more than
one Data instance, Guard can be an instantiated object with instance methods;
each Data instance would have a reference to the same Guard object.
Of course, Guard can always be a non-instantiated class having only static
methods.
If Data is multi-threaded, and there is only one Data instantiated,
Guard can be an instantiated object with instance methods; each Data instance
would have a reference to the same Guard object. Of course, Guard
can always be a non-instantiated class having only static methods.
If Data is multi-threaded, and there can be more than one Data instantiated,
Guard can be an instantiated object with instance methods; each Data instance
would have a reference to the same Guard object. Of course, Guard
can always be a non-instantiated class having only static methods.
The point is, then, that Guard is meant to be a loner, to be associated
with only one database file, and that Guard is meant to be multi-threaded.
Even though Guard is a shared object, by Guard's very nature of being
a thread resource allocator concerning records, usually no special handling
of Guard itself is required.
That is, the following line of code works equally well in a single-threaded
Data as in a multi-threaded Data:
guard.lock(500);

Of course, we have not yet said that Data would contain business methods within
it, but the above code represents a multi-threaded business method somewhere
(either on the server or on the client, depending on how you set up your
design); for me to say that the above example is in Data is not how my design
will be.
Nevertheless, given some multi-threaded object somewhere that contains
business methods, simply calling guard.lock(500) is not a multi-threading
issue because it is thread safe.
However, within a multi-threaded business method, the following complete
business method may not be thread-safe:

or perhaps the assignment of the record number itself is sufficient to make
the multi-threaded business method unthread-safe:
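A sketch of that trap, under the assumption that the business object keeps the record number in an instance field. Both classes are hypothetical, and the guard calls are left as comments so that only the handling of the record number differs:

```java
// Unsafe when one instance is shared by several threads.
class UnsafeBusiness {
    private int recordNumber;            // instance field: shared by every thread

    int book(int requested) {
        recordNumber = requested;        // another thread may overwrite this...
        // guard.lock(recordNumber); read, process, write; guard.unlock(recordNumber);
        return recordNumber;             // ...so this may be someone else's record
    }
}

// Safe: the record number lives on the calling thread's stack.
class SafeBusiness {
    int book(int requested) {
        int recordNumber = requested;    // local variable: one copy per thread
        // guard.lock(recordNumber); read, process, write; guard.unlock(recordNumber);
        return recordNumber;
    }
}

class BusinessDemo {
    // Sends `threads` threads through one SafeBusiness instance and counts
    // how many got back a record number other than the one they asked for.
    static int mismatches(int threads) {
        final SafeBusiness business = new SafeBusiness();
        final int[] results = new int[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                results[id] = business.book(id);
            });
            workers[i].start();
        }
        int wrong = 0;
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (results[i] != i) {
                wrong++;
            }
        }
        return wrong;
    }
}
```

In UnsafeBusiness a thread could lock one record and then unlock a different one, because the field may change between the two calls; Guard itself is behaving correctly in both versions.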

Thus, within a multi-threaded business method, Guard and its methods are
threadsafe, but the surrounding code may be full of traps for the
unwary. [Which is why I'm reviewing this important material.]
If treading in a multi-threaded environment makes you feel uncertain,
then it is recommended, obviously, that you use a factory to deal out
exactly one instance of Data to each client. No one will question,
even for a second, that single-threaded code is easier to understand,
modify, and verify than multi-threaded code. I may very well go this
route myself.
An interesting question Phil has contended with is this: what if I
design my system so that it can work equally well on more than
one database file? Based upon a posting Phil made, I speculate
that Phil's design only has one Guard instance, even when there are
multiple database files. There probably is a very good reason Phil
settled on this design (since he is very sharp). So, let's investigate
this and see why he did not choose the simpler design, in which one
instance of Guard is created for each different database.
We have already determined that Guard need not be a non-instantiated
class with only static methods. Thus, we can certainly instantiate
one Guard object for each, different database. By the way, it may
make more sense to start referring to these databases as database
tables.
If we do this, then, of course, the Guard object does not need to
have coding logic to account for different database tables.
Let's give it a walk-through. The multi-threaded or single-threaded
business method looks like this:
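Here is a sketch of the one-Guard-per-table arrangement, not Phil's design. GuardRegistry, guardFor(), TableBusiness, and the table names are hypothetical; the Guard is a bare-bones assumption that knows only about record numbers:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// A Guard that knows nothing about tables: it only tracks record numbers.
class Guard {
    private final Set<Integer> locked = new HashSet<>();

    synchronized void lock(int recordNumber) {
        boolean interrupted = false;
        while (locked.contains(recordNumber)) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;      // keep waiting, remember the interrupt
            }
        }
        locked.add(recordNumber);
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized void unlock(int recordNumber) {
        locked.remove(recordNumber);
        notifyAll();
    }

    synchronized boolean isLocked(int recordNumber) {
        return locked.contains(recordNumber);
    }
}

// Hypothetical helper: the only table-aware logic lives here.
class GuardRegistry {
    private final Map<String, Guard> guards = new HashMap<>();

    // Returns the single Guard for a table, creating it on first use.
    synchronized Guard guardFor(String tableName) {
        Guard guard = guards.get(tableName);
        if (guard == null) {
            guard = new Guard();
            guards.put(tableName, guard);
        }
        return guard;
    }
}

class TableBusiness {
    // The business method picks the Guard for its table, then proceeds
    // exactly as in the single-table case.
    static void book(GuardRegistry registry, String tableName, int recordNumber) {
        Guard guard = registry.guardFor(tableName);
        guard.lock(recordNumber);
        try {
            // read, process, and write the record here
        } finally {
            guard.unlock(recordNumber);
        }
    }
}
```

Locking record 100 in one table leaves record 100 in another table untouched, since each table has its own Guard and its own set of locked record numbers.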

In conclusion, it is unclear why any coding logic for two database
tables would exist within the Guard object, unless that logic was
relatively small and dealt only with pairing the correct Guard with
the correct underlying database.
Again, our projects need not deal with two databases or two database
tables.
Thanks,
Javini Javono
[ February 12, 2004: Message edited by: Javini Javono ]
 
 