question regarding to retrieveDvd method in Monkhouse book

Helen Ma
Ranch Hand

Joined: Nov 01, 2011
Posts: 451
Chapter 5 of the Monkhouse book recommends that the retrieveDvd method wrap the file access in a synchronized block instead of using the read lock of a ReadWriteLock.

The reason it uses the synchronized keyword, not ReadWriteLock, is the following possibility:
1. Suppose we have two reading threads invoking retrieveDvd on the same DVDFileAccess instance.
2. Thread 1 may have a different locationInFile value than thread 2.
3. When the two threads access retrieveDvd concurrently, thread 2 may read a wrong locationInFile that was set by thread 1.

But in my opinion that is not possible, because locationInFile is a method-local variable. Each thread has its own copy of locationInFile.
So I think it is OK to use a ReadWriteLock for concurrent reading.
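
For reference, the two variants under discussion might look roughly like this. It is only a sketch, not the book's code: the class name RecordReaderSketch, the lock field, the helper method and the record length of 75 are assumptions; only database, locationInFile, seek and readFully come from the thread.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for DVDFileAccess; all names here are illustrative.
class RecordReaderSketch {
    static final int RECORD_LENGTH = 75; // assumed record length

    private final RandomAccessFile database;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    RecordReaderSketch(RandomAccessFile database) {
        this.database = database;
    }

    // Variant A: only one thread at a time may run seek + readFully.
    byte[] readSynchronized(long locationInFile) {
        byte[] input = new byte[RECORD_LENGTH];
        try {
            synchronized (database) {
                database.seek(locationInFile);
                database.readFully(input);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return input;
    }

    // Variant B: several threads may hold the read lock at the same time.
    byte[] readWithReadLock(long locationInFile) {
        byte[] input = new byte[RECORD_LENGTH];
        lock.readLock().lock();
        try {
            database.seek(locationInFile);
            database.readFully(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            lock.readLock().unlock();
        }
        return input;
    }

    // Single-threaded check: with no interleaving, both variants return the same bytes.
    static boolean variantsAgreeWithoutContention() {
        try {
            File file = File.createTempFile("dvd-sketch", ".db");
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                byte[] data = new byte[RECORD_LENGTH * 3];
                for (int n = 0; n < data.length; n++) {
                    data[n] = (byte) (n / RECORD_LENGTH);
                }
                raf.write(data);
                RecordReaderSketch reader = new RecordReaderSketch(raf);
                return Arrays.equals(reader.readSynchronized(150), reader.readWithReadLock(150));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(variantsAgreeWithoutContention());
    }
}
```

On a single thread the two variants behave identically; the question in this thread is what happens when two threads run variant B at the same time.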


Let me give a similar example. Suppose we have a method that sets a local variable i to 0, prints it, and then increments it.

If this method is executed concurrently by two threads, will the update of i be affected by the other thread? I think the answer is no. For example:
1. Thread 1 sets i = 0.
2. Thread 1 executes the print statement and sets i = 1.
3. Thread 2 sets i = 0, and so on.
Why is i not 1 in step 3? Because i is a method-local variable, so thread 2 has its own copy.
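
A minimal sketch of that idea, assuming a made-up method countTo (the print statement is left out to keep the output short): each call gets its own stack frame and therefore its own i, so both threads count to the full limit regardless of how they interleave.

```java
// Illustrative only: class and method names are assumptions, not from the book.
class LocalVariableDemo {

    // i lives on the calling thread's stack; no other thread can see or change it.
    static int countTo(int limit) {
        int i = 0;
        while (i < limit) {
            i++;
        }
        return i;
    }

    // Runs countTo on two threads concurrently and returns what each one counted.
    static int[] runTwoThreads(int limit) {
        int[] results = new int[2];
        Thread t1 = new Thread(() -> results[0] = countTo(limit));
        Thread t2 = new Thread(() -> results[1] = countTo(limit));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        int[] results = runTwoThreads(100_000);
        System.out.println(results[0] + " " + results[1]); // prints "100000 100000"
    }
}
```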

Any comments?
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5407
    

I'm still wondering if you took the time to study the API carefully. Clearly you are unfamiliar with the new concurrency API, which makes it harder to understand a code snippet. I'm not a concurrency guru either, but I made the effort to look at ReadWriteLock and noticed within 30 seconds why it's a bad idea to use it for concurrent reading here (because it will not be thread-safe).

I'm not going to solve the issue for you, because you want to be a certified developer and these are simply skills of a good developer: understanding a given snippet of code (and searching for info if you are not familiar with some API), knowing why one solution is better than another, and so on.
You are correct: each thread has its own locationInFile (local variables are indeed thread-safe). But if thread T1 wants to read record1 and thread T2 wants to read record2, the concurrent reading can (and will) go wrong if you use the code described in your post (although both threads have a different value for locationInFile). The reason why is up to you to discover (and learn).

If you really can't find the reason why, just let us know and I (or someone else) will definitely help you.


SCJA, SCJP (1.4 | 5.0 | 6.0), SCJD
http://www.javaroe.be/
Helen Ma
Ranch Hand

Joined: Nov 01, 2011
Posts: 451
I read ReentrantReadWriteLock.java as well as a Java reference book on the subject.
(Different people may come up with different interpretations when they read the API or books.)

The issue with ReadWriteLock is that it still allows concurrent reading from different threads. With a synchronized read, only one thread can read at a time.

I thought about a real-life example. When two users try to reserve a DVD, the code should use the synchronized keyword so that only one user at a time can verify there is a copy of the DVD and rent it.
If a ReadWriteLock is used instead, what may happen? Two users see there is a copy of the DVD. The first user rents it. The second user cannot rent it and will ask, "There is a copy there, so why can't I rent it?"

Like you said, "if T1 reads record1 and T2 reads record2, the concurrent reading can go wrong..."
If T1 reads one record and T2 reads another record, theoretically it is thread-safe. For example, if I rent DVD 1 and you rent DVD 2 at the same time, we are OK. However, if we both rent DVD 1 at the same time and there is only one copy available, then it is not thread-safe. I guess that is what the Monkhouse book says. If two threads try to seek a DVD based on the UPC, theoretically it is OK. But in practice the locationInFile variable may be changed, so using readLock for concurrent reading may cause a read from the wrong place in the file. As I look at the code, locationInFile itself is a local variable of which each thread owns its own copy. I am investigating the specific circumstances under which locationInFile is changed.



Again, concurrent programming is a complicated topic, and we may all have to deal with different, specific situations. Maybe I am looking into too much detail...
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5407
    

From the ReadWriteLock interface:
The read lock may be held simultaneously by multiple reader threads, so long as there are no writers. The write lock is exclusive.


So that's the key point: it allows concurrent reading by multiple threads (like you already said). Keep this in mind!

Reading some bytes from the file is a two-step operation: first you have to position the file pointer correctly (seek), then you can actually read the bytes you want (readFully). Now it's time to think back to the concurrent reading. What happens if two threads access that code simultaneously? Time for some brain exercise.
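
To make the two steps concrete, here is a small single-threaded sketch. The class name, the temporary file and the record length of 75 are assumptions for illustration; seek, readFully and getFilePointer are the real RandomAccessFile methods.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Illustrative only: shows where the file pointer is before and after one record read.
class SeekThenReadDemo {
    static final int RECORD_LENGTH = 75; // assumed record length

    // Returns {pointer after seek, pointer after readFully} for one record read.
    static long[] pointerBeforeAndAfterRead(long locationInFile) {
        try {
            File file = File.createTempFile("dvd-seek", ".db");
            file.deleteOnExit();
            try (RandomAccessFile database = new RandomAccessFile(file, "rw")) {
                database.write(new byte[RECORD_LENGTH * 5]); // five empty records
                database.seek(locationInFile);               // step 1: position the pointer
                long before = database.getFilePointer();
                database.readFully(new byte[RECORD_LENGTH]); // step 2: read from the pointer
                long after = database.getFilePointer();
                return new long[] {before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] pointers = pointerBeforeAndAfterRead(150);
        System.out.println(pointers[0] + " -> " + pointers[1]); // prints "150 -> 225"
    }
}
```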
Helen Ma
Ranch Hand

Joined: Nov 01, 2011
Posts: 451
Consider the concurrent reading.
Let me quote Monkhouse's code:

From the book:

Multiple reads can occur simultaneously without affecting other threads. Therefore, access to the recordNumbers map is a perfect candidate for a ReadWriteLock.



If the position in the file was changed by another thread between lines 203 and 204, we would end up reading from the wrong location in the file.


This is my question:
locationInFile is a method-local variable. If thread 1 uses an instance of DVDFileAccess to invoke retrieveDvd and thread 2 uses the same instance to invoke retrieveDvd concurrently, thread 1 will have its own copy of locationInFile, and so will thread 2.
The book says that if the synchronized block at line 202 is replaced by the read lock, concurrent reading is allowed.
If that is the case, what may happen is that locationInFile is changed by another thread.

I am wondering: locationInFile is a method-local variable of which each thread has its own copy, so another thread cannot "mess up" that thread's locationInFile variable.
Helen Ma
Ranch Hand

Joined: Nov 01, 2011
Posts: 451
Roel De Nijs wrote: From the ReadWriteLock interface:
The read lock may be held simultaneously by multiple reader threads, so long as there are no writers. The write lock is exclusive.


So that's the key point: it allows concurrent reading by multiple threads (like you already said). Keep this in mind!

Reading some bytes from the file is a two-step operation: first you have to position the file pointer correctly (seek), then you can actually read the bytes you want (readFully). Now it's time to think back to the concurrent reading. What happens if two threads access that code simultaneously? Time for some brain exercise.


Let me answer this question:


If this code is accessed by two threads concurrently, thread 1 has its own copy of locationInFile, and so does thread 2.
Possible sequence:
1. Th 1 seeks to its locationInFile.
2. Th 2 seeks to its locationInFile.
3. Th 1 calls readFully.
4. Th 2 calls readFully.
Since they are only doing reading, my interpretation tells me the concurrent reading won't cause reading a wrong location. Will Th 2 read the locationInFile specified by Th 1? I don't think so because Th 2 has no idea what value of locationInFile Th 1 has.
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5407
    

Now I see what problem you are struggling with.

Helen Ma wrote: I am wondering: locationInFile is a method-local variable of which each thread has its own copy, so another thread cannot "mess up" that thread's locationInFile variable.

You are correct. It's not the value of locationInFile that's messed up by different threads. It's the file pointer of the random access file (the variable database) that could be changed by another thread.

This was a very short explanation, because I have to go now. But it should give you more food for thought, and I think you'll understand it now. If you don't, I'll give a more detailed explanation later this evening (or tomorrow), which should make things really obvious.
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5407
    

Helen Ma wrote: my interpretation tells me the concurrent reading won't cause reading a wrong location. Will Th 2 read the locationInFile specified by Th 1? I don't think so because Th 2 has no idea what value of locationInFile Th 1 has.

Both threads are manipulating the same instance variable (database), so imagine this scenario:
a) T1 changes the file pointer to 150 (T1's locationInFile, for record 1)
b) the thread scheduler moves T1 to the waiting state and chooses T2 to run
c) T2 acquires the lock (T1 already holds a read lock, but multiple read locks may be held at once) and changes the file pointer to 225 (T2's locationInFile, for record 2)
d) T2 reads record 2 (as a result of the read, the file pointer is now at the beginning of record 3)
e) the thread scheduler moves T2 to the waiting state and chooses T1 to run
f) T1 continues where it left off before it was put in the waiting state, so with database.readFully(). Which record will T1 read? And which record did T1 want to read?
Helen Ma
Ranch Hand

Joined: Nov 01, 2011
Posts: 451
Let me answer and interpret this question step by step, based on the above post.
Suppose we have two threads:
a) T1 sets locationInFile to 150 in the getDvd method, executes retrieveDvd and invokes seek(150).
b) The scheduler puts T1 in the waiting state and lets T2 run.
c) T2 sets locationInFile to 225 in the getDvd method.
d) T1 is still waiting. T2 executes the retrieveDvd method, invoking seek(225) and readFully. The file pointer stops at the beginning of the next record, say location 250.
e) The scheduler puts T2 in the waiting state and lets T1 run again.
f) T1 continues to execute retrieveDvd and invokes database.readFully(input). But now the file pointer points to 250 instead of 150. T1 reads from location 250 and puts those bytes in the input array.


If that is the right logic, it resolves the doubt I have had for the past two days about why the synchronized keyword is used around database.seek and readFully.

Thanks.
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5407
    

Helen Ma wrote: If that is the right logic,

It definitely is the right logic. Well done! (Only a small remark: the length of a record is 75 in the example, so the position of record 3 would be 300 instead of 250.)
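
That interleaving can be replayed deterministically on a single thread, without any real scheduling. The sketch below is illustrative only: the class name, the temporary file and its contents are assumptions, and the calls are simply issued in the order from steps a) to f) against one shared RandomAccessFile.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Illustrative replay of the unlucky interleaving; not the book's code.
class InterleavingReplay {
    static final int RECORD_LENGTH = 75;
    static final int RECORD_COUNT = 5;

    // Builds a temp file in which every byte of the record at offset n * 75 equals n.
    static RandomAccessFile openSampleFile() throws IOException {
        File file = File.createTempFile("dvd-replay", ".db");
        file.deleteOnExit();
        RandomAccessFile database = new RandomAccessFile(file, "rw");
        byte[] record = new byte[RECORD_LENGTH];
        for (int n = 0; n < RECORD_COUNT; n++) {
            Arrays.fill(record, (byte) n);
            database.write(record);
        }
        return database;
    }

    // Returns the file offset of the record that "T1" actually ends up reading.
    static long offsetReadByT1() {
        try (RandomAccessFile database = openSampleFile()) {
            byte[] t1Input = new byte[RECORD_LENGTH];
            byte[] t2Input = new byte[RECORD_LENGTH];
            database.seek(150);          // a) T1 positions the shared pointer
            database.seek(225);          // c) T2 moves the same pointer
            database.readFully(t2Input); // d) T2 reads; the pointer is now at 300
            database.readFully(t1Input); // f) T1 resumes and reads from the current pointer
            return (long) t1Input[0] * RECORD_LENGTH;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "T1 wanted offset 150 but read the record at offset 300"
        System.out.println("T1 wanted offset 150 but read the record at offset " + offsetReadByT1());
    }
}
```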

And what about this scenario:
a) T1 sets locationInFile to 150 in the getDvd method, executes retrieveDvd and invokes seek(150).
b) The scheduler puts T1 in the waiting state and lets T2 run.
c) T2 sets locationInFile to 225 in the getDvd method.
d) The scheduler puts T2 in the waiting state and lets T1 run.

Which records will T1 and T2 read this time?
Helen Ma
Ranch Hand

Joined: Nov 01, 2011
Posts: 451
If the code in retrieveDvd is like this:


In the scenario given above,
a) T1 sets locationInFile = 150 and executes seek(150).
b) T1 waits and T2 runs.
c) T2 sets locationInFile = 225.
d) T2 waits and T1 runs.

One possibility is:
1. T1 executes database.readFully(input) to read data at offset 150; assume the length of the record is 75. It is possible that the file pointer stops at 200 and the scheduler makes T1 wait and lets T2 run.
2. Since T2 can also acquire the read lock, T2 executes seek(225). The file pointer moves from 200 to 225. T2 finishes readFully and the file pointer is at 300. The scheduler makes T2 wait and lets T1 run.
3. When T1 continues, the file pointer starts at 300; T1 reads the rest of its data and stops at 325.
Result: T1 is expected to read the data from 150 to 225, but it ends up with the bytes from 150 to 200 followed by the bytes from 300 to 325, which is wrong.

Another possibility is:
1. T1 executes readFully(input) and finishes the reading. The file pointer moves from 150 to 225. The scheduler makes T1 wait and lets T2 run.
2. T2 acquires the read lock. T2 executes seek(225) and readFully. The file pointer moves from 225 to 300. The scheduler makes T2 wait and lets T1 run.
3. T2 releases the lock and then T1 releases the lock.
Result: T1 reads from 150 to 225, which is correct. T2 reads from 225 to 300, which is correct.
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5407
    

Helen Ma wrote: One possibility is:
1. T1 executes database.readFully(input) to read data at offset 150; assume the length of the record is 75. It is possible that the file pointer stops at 200 and the scheduler makes T1 wait and lets T2 run.
2. Since T2 can also acquire the read lock, T2 executes seek(225). The file pointer moves from 200 to 225. T2 finishes readFully and the file pointer is at 300. The scheduler makes T2 wait and lets T1 run.
3. When T1 continues, the file pointer starts at 300; T1 reads the rest of its data and stops at 325.
Result: T1 is expected to read the data from 150 to 225, but it ends up with the bytes from 150 to 200 followed by the bytes from 300 to 325, which is wrong.

That's indeed a possible scenario. It depends on the implementation of the readFully method. If it's an atomic operation or if the implementation is thread-safe, then it's not a possibility. Otherwise it could definitely mess things up badly.

Helen Ma wrote: Another possibility is:
1. T1 executes readFully(input) and finishes the reading. The file pointer moves from 150 to 225. The scheduler makes T1 wait and lets T2 run.
2. T2 acquires the read lock. T2 executes seek(225) and readFully. The file pointer moves from 225 to 300. The scheduler makes T2 wait and lets T1 run.
3. T2 releases the lock and then T1 releases the lock.
Result: T1 reads from 150 to 225, which is correct. T2 reads from 225 to 300, which is correct.

That's indeed another possibility, and that's what you would expect in a flawlessly working program.


I think you now have a good understanding of why the book suggests using a synchronized block instead of ReadWriteLock. Well done!
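
As a closing illustration, here is a hedged sketch of the synchronized approach under real concurrency. The class name, thread counts and sample file are assumptions; the point is only that seek and readFully run as one unit while a thread holds the monitor of the shared database object, so every thread gets back the record it asked for.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: several threads share one RandomAccessFile and read under synchronized.
class SynchronizedReadDemo {
    static final int RECORD_LENGTH = 75;
    static final int RECORD_COUNT = 5;

    // Builds a temp file in which every byte of record n equals n.
    static RandomAccessFile openSampleFile() throws IOException {
        File file = File.createTempFile("dvd-sync", ".db");
        file.deleteOnExit();
        RandomAccessFile database = new RandomAccessFile(file, "rw");
        byte[] record = new byte[RECORD_LENGTH];
        for (int n = 0; n < RECORD_COUNT; n++) {
            Arrays.fill(record, (byte) n);
            database.write(record);
        }
        return database;
    }

    // Each thread repeatedly reads "its" record; returns how many reads came back wrong.
    static int countWrongReads(int threads, int readsPerThread) {
        try (RandomAccessFile database = openSampleFile()) {
            AtomicInteger wrong = new AtomicInteger();
            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                final int recordNumber = t % RECORD_COUNT;
                workers[t] = new Thread(() -> {
                    byte[] input = new byte[RECORD_LENGTH];
                    for (int n = 0; n < readsPerThread; n++) {
                        try {
                            synchronized (database) { // seek + readFully as one unit
                                database.seek((long) recordNumber * RECORD_LENGTH);
                                database.readFully(input);
                            }
                        } catch (IOException e) {
                            wrong.incrementAndGet(); // count an I/O failure as a wrong read
                            continue;
                        }
                        if (input[0] != recordNumber || input[RECORD_LENGTH - 1] != recordNumber) {
                            wrong.incrementAndGet();
                        }
                    }
                });
                workers[t].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            return wrong.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("wrong reads: " + countWrongReads(4, 200));
    }
}
```

If the synchronized block were replaced by a read lock, the same test could report wrong reads on some runs, which is the behaviour discussed above.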
 