Helen Ma

Ranch Hand
since Nov 01, 2011

Recent posts by Helen Ma

I did an experiment. Instead of calling seek and readFully, I printed out locationInFile.
I ran the code in the synchronized block, just as the retrieveDvd method does, using 100+ threads. The locationInFile values were output correctly.
Since the offsets are computed correctly, I believe the problem is in the seek/readFully step: my hardware moves the file pointer to a wrong location in the file because of a performance issue.
Hi, thanks for your reply.

To be more specific, the code above has a FileAccess class, and the FileAccess instance is static:


My threads in the test class use the same instance, f, to retrieve the records inside the mutex.
I checked the RandomAccessFile seek method. It is a native method.
I suspect:
When 100+ threads use the same f to retrieve records, the mutex works in theory. But since the seek method is carried out by the hardware, my hardware is not good enough to move the file pointer to the right location in the data file, and I end up with an EOFException or a wrong record.
I am pretty sure it has something to do with the performance of my hardware.
I did an experiment with a large number of threads:



Case 1:
In the Test1 class, the threads call the join method. The output looks something like this:
T0: (record1), (record2), (record3) ....where record# is the data in each record.
T1: (record1), (record2), (record3) .....
T2: (record1), (record2), (record3).....


Case 2:
Without the join method, one possible output is:
T7: (record1), (record2) .....
T9: (record2), (record2), (record3)....
T4: EOF Exception....

IMHO, the file pointer cannot catch up when the CPU switches from T7 to T9, and T9 ends up reading from an unexpected location.
The same issue may happen for T4 too.

But in the real application, I can't use join to control the order of the threads' execution.

Hi, Andrew. Thanks for your advice.
I like your book's approach:




I have been doing something similar. I found that the file pointer cannot move fast enough to the correct location in the file when the CPU switches from one thread to another during concurrent reading. It ends up reading the wrong data.
There is nothing wrong with your concurrency design; it is just that the file pointer cannot keep up with a large number of threads doing concurrent reads.

For example, I have 200 threads doing concurrent reading, and only one thread can access the file at a time.
1. Thread 1 reads the last record in the file. The file pointer is at the end of the file.
2. The CPU switches to Thread 2. Thread 2 tries to read the first record. At this moment, only Thread 2 is reading from the file. No other thread is accessing the file because of the synchronized block. The file pointer is supposed to move to the first record's location and read fully. But the file pointer cannot move fast enough, and the read ends up hitting EOF.
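
For reference, the scenario above can be reproduced in isolation with a small test like the one below. This is only a sketch under assumptions: the class name, RECORD_LENGTH, RECORD_COUNT and the record contents are all hypothetical and are not taken from the book or from the assignment. It puts seek and readFully in the same synchronized block on the one shared RandomAccessFile and counts every wrong record or IOException (an EOFException included).

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedFileReadSketch {
    static final int RECORD_LENGTH = 16; // hypothetical fixed record size in bytes
    static final int RECORD_COUNT = 50;  // hypothetical number of records

    // Fixed-width test content for a record, padded with spaces to RECORD_LENGTH.
    static String expected(int recNo) {
        return String.format("%-" + RECORD_LENGTH + "s", "record" + recNo);
    }

    // seek and readFully share one synchronized block on the shared file object,
    // so no other thread can move the file pointer between the two calls.
    static String readRecord(RandomAccessFile file, int recNo) throws IOException {
        byte[] buf = new byte[RECORD_LENGTH];
        synchronized (file) {
            file.seek((long) recNo * RECORD_LENGTH);
            file.readFully(buf);
        }
        return new String(buf, StandardCharsets.US_ASCII);
    }

    // Starts threadCount readers on one RandomAccessFile and returns the number
    // of wrong records or I/O errors (EOFException included) that were seen.
    static int runTest(int threadCount) {
        AtomicInteger errors = new AtomicInteger();
        try {
            File tmp = File.createTempFile("records", ".db");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
                for (int i = 0; i < RECORD_COUNT; i++) {
                    file.write(expected(i).getBytes(StandardCharsets.US_ASCII));
                }
                Thread[] threads = new Thread[threadCount];
                for (int t = 0; t < threadCount; t++) {
                    final int id = t;
                    threads[t] = new Thread(() -> {
                        for (int k = 0; k < 100; k++) {
                            // Jump around the file so consecutive reads are far apart.
                            int recNo = (id * 31 + k * 17) % RECORD_COUNT;
                            try {
                                if (!readRecord(file, recNo).equals(expected(recNo))) {
                                    errors.incrementAndGet();
                                }
                            } catch (IOException e) {
                                errors.incrementAndGet();
                            }
                        }
                    });
                    threads[t].start();
                }
                // join is called only after every thread has been started, so it
                // waits for completion without forcing the threads to run in order.
                for (Thread th : threads) {
                    th.join();
                }
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return errors.get();
    }

    public static void main(String[] args) {
        System.out.println("errors: " + runTest(200));
    }
}
```

If a test like this reports errors, it may be worth checking whether every seek/readFully pair really runs under the same monitor on the same RandomAccessFile instance, and whether the file is closed before all threads have finished.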

This is my opinion on why I get an EOFException when dealing with a large number of threads.
I am sure that in the exam there will not be that many clients trying to read the file at the same time.

Thus private method of inner class Color is visible throughout class Main.



Hi, I don't see any private method of the inner class Color in the example. Are you talking about the GREEN object? The GREEN anonymous object has a public method, not a private one. The private method belongs to Color, not to the anonymous object.
To my understanding, the GREEN anonymous object can see the private method, and it has another, public method. The public method does not override anything. In the main method, Color.GREEN.method() actually refers to the private method.
I have the same experience with another rancher too.
12 years ago
I got my Bachelor's and Master's degrees in computer science. During that time, multithreading and parallel/distributed computing were taught a lot at my school.
As my professor said, every operating system has the deadlock problem. I am not sure if this is still true nowadays.

I've been working for small and big companies as a software engineer for 6 years in the US.
At work, I have been dealing with multithreading issues that cause a lot of problems in software.
I have come to realize that handling multithreading properly is never easy.

Another point about this post is that I feel more comfortable limiting my discussion to general issues or Monkhouse's book. I am not supposed to post too much about my own work.
I can only show part of my experiments, not the complete code. (My professor joked: "If someone wants to copy your homework, just give them some wrong answers to copy!")

Anyway, I tried something like this, similar to Monkhouse's DVD example:
When I run my test with a large number of threads, it still gives me an EOFException.
According to Monkhouse's book, multiple threads can enter retrieveRecord() at once, but they can only enter the retrieveRecords(long i) method one at a time (mutual exclusion), so that the record is read from the file correctly.
This is a very nice design from my perspective. However, it still throws an EOFException for me, as the file pointer cannot point to the right record when the CPU switches from one thread to another.




I think talking about threading in text is more confusing than talking about it face to face.
If I did not understand threading topics, I could not have handled the OCJP or even passed the Operating Systems class in my computer science program.
(I remember the Operating Systems class dealt a lot with deadlock, starvation and mutual exclusion.)
If I did not understand Monkhouse's example, I would not have come up with questions about it.

Anyway, I think the following code is equivalent:


Case 1:

Output:


Case 2:

Output:


I am a little bit confused and here is my idea:
In case 1, GREEN refers to an anonymous subclass of Color. GREEN overrides the public method.
In case 2, the GREEN class does not override the private method in Color, because a private method cannot be overridden. GREEN has its own public method.
However, in the main method, GREEN.method() actually refers to the private method, because GREEN is a Color and the main method can "see" Color's private method.
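
The two cases can be put side by side in one small program. This is a sketch, not the code from the thread: the names EnumPrivateMethodSketch, Color1 and Color2 and the returned strings are made up for illustration. The enums are nested in the class that calls them, which is what makes the private method accessible from the caller.

```java
public class EnumPrivateMethodSketch {

    // Case 1: method() is public in the enum, so the constant body overrides it.
    enum Color1 {
        GREEN {
            @Override
            public String method() {
                return "GREEN body";
            }
        };

        public String method() {
            return "Color";
        }
    }

    // Case 2: method() is private in the enum. A private method is not inherited,
    // so the constant body declares a new, unrelated public method instead.
    enum Color2 {
        GREEN {
            public String method() {
                return "GREEN body";
            }
        };

        private String method() {
            return "Color";
        }
    }

    // The static type of Color1.GREEN is Color1; the call is dispatched
    // virtually and reaches the override in the constant body.
    static String case1() {
        return Color1.GREEN.method();
    }

    // The static type of Color2.GREEN is Color2; the compiler binds the call to
    // the private method of Color2, and private methods are never overridden.
    static String case2() {
        return Color2.GREEN.method();
    }

    public static void main(String[] args) {
        System.out.println("case 1: " + case1());
        System.out.println("case 2: " + case2());
    }
}
```

Adding @Override to the method in the Color2 constant body would turn the misunderstanding into a compile error, which is one reason to use the annotation routinely.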
Yes. The record length is the whole length of a record, including all fields as specified.
So, the code becomes this:

In Monkhouse's book, its algorithm for retrieving DVD is like this:


So, you mean, instead of using readLock to support concurrent reading, use this algorithm?
Yes. Thanks for pointing that out. The method should be retrieveRecords(i) instead of retrieveRecord(i). RECORD_LENGTH is the number of bytes in a record, which is a constant somewhere in the class.
To my understanding, synchronized (file) works the same as lock() and unlock() for retrieveRecords(i). Monkhouse's book mentions that we can transform a synchronized block into lock/unlock calls.
So, in this case, will it become this?
Do we have a synchronized block enclosing a lock/unlock critical section?
Do we need both of them?
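
As an illustration of the question, here is a minimal sketch of the same read written both ways. The class name, the fileLock field, the 8-byte RECORD_LENGTH and the demo data are assumptions made for this example; they are not Monkhouse's code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class LockVersusSynchronizedSketch {
    static final int RECORD_LENGTH = 8; // hypothetical record size in bytes

    private final RandomAccessFile file;
    private final Lock fileLock = new ReentrantLock(); // hypothetical explicit lock

    LockVersusSynchronizedSketch(RandomAccessFile file) {
        this.file = file;
    }

    // Variant A: the monitor of the file object guards seek + readFully.
    String readWithSynchronized(long recNo) throws IOException {
        byte[] buf = new byte[RECORD_LENGTH];
        synchronized (file) {
            file.seek(recNo * RECORD_LENGTH);
            file.readFully(buf);
        }
        return new String(buf, StandardCharsets.US_ASCII);
    }

    // Variant B: an explicit lock guards the same two calls; unlock() goes in
    // finally so the lock is released even if the read throws.
    String readWithLock(long recNo) throws IOException {
        byte[] buf = new byte[RECORD_LENGTH];
        fileLock.lock();
        try {
            file.seek(recNo * RECORD_LENGTH);
            file.readFully(buf);
        } finally {
            fileLock.unlock();
        }
        return new String(buf, StandardCharsets.US_ASCII);
    }

    // Writes three 8-byte records to a temp file and reads the third one back.
    static String demo(boolean useLock) {
        try {
            File tmp = File.createTempFile("lockdemo", ".db");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                raf.write("AAAAAAAABBBBBBBBCCCCCCCC".getBytes(StandardCharsets.US_ASCII));
                LockVersusSynchronizedSketch s = new LockVersusSynchronizedSketch(raf);
                return useLock ? s.readWithLock(2) : s.readWithSynchronized(2);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(false) + " " + demo(true));
    }
}
```

Either variant on its own gives mutual exclusion for the seek/readFully pair, provided every piece of code that touches the file uses that same mechanism. Nesting one inside the other does not add protection, and mixing them (some callers using synchronized, others using the lock) would leave the two groups unprotected from each other.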

Here is what I am experimenting with, not the actual work:
This is another experiment using Monkhouse's approach:

I used a read lock around the block in retrieveRecords, so that multiple reader threads can execute that block of code concurrently.
When it comes to reading a particular location in the file, the code synchronizes on the file, so that seek and readFully are only executed by one thread at a time.
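
A minimal sketch of this arrangement, under assumptions: ReadLockSketch, the 8-byte RECORD_LENGTH and the demo records are hypothetical names and data used only for illustration, and the write path is left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockSketch {
    static final int RECORD_LENGTH = 8; // hypothetical record size in bytes

    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final RandomAccessFile file;

    ReadLockSketch(RandomAccessFile file) {
        this.file = file;
    }

    String retrieveRecords(long recNo) throws IOException {
        // Many readers may hold the read lock at once; only a thread holding
        // the write lock (not shown in this sketch) would keep them out.
        rwLock.readLock().lock();
        try {
            byte[] buf = new byte[RECORD_LENGTH];
            // The file pointer is shared state, so the seek + readFully pair
            // still runs under one monitor, one thread at a time.
            synchronized (file) {
                file.seek(recNo * RECORD_LENGTH);
                file.readFully(buf);
            }
            // Decoding the bytes touches no shared state and can overlap
            // with other readers.
            return new String(buf, StandardCharsets.US_ASCII).trim();
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // Writes three space-padded records to a temp file and reads one back.
    static String demo(long recNo) {
        try {
            File tmp = File.createTempFile("readlock", ".db");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                raf.write("rec0    rec1    rec2    ".getBytes(StandardCharsets.US_ASCII));
                return new ReadLockSketch(raf).retrieveRecords(recNo);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(2));
    }
}
```

In this sketch the read lock by itself does not protect the file pointer; the inner synchronized block does that. The read lock only starts to matter once writers take the write lock.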

This approach performs better. But when I test with a large number of threads (e.g. 200 threads), the test throws an EOFException or reads the wrong data.
My thinking is:
1. Why the EOFException? After a first thread reaches EOF, a second thread tries to read a particular record from the file. The code is properly locked and synchronized, but the problem still occurs. It may be because the file pointer of the RandomAccessFile cannot move fast enough from EOF to where that record is when the CPU switches to the second thread. So, when the second thread executes, the file pointer still points to EOF.

2. Why is a wrong set of data sometimes read? The idea may be the same as above. The first thread finishes reading a record, say record 3. The CPU switches to a second thread, which is about to read another record, say record 20. However, the file pointer cannot move fast enough from record 3 to record 20. So, the second thread ends up reading record 4, which is right next to record 3.




Here is what I am experimenting with, not the actual work:

As suggested by Roel, I synchronized the whole block of code in retrieveRecords, so that while one thread is reading the file, other threads cannot enter that block.
It works fine, although it seems slower because no concurrent reading is allowed.
But according to Monkhouse, performance is not a requirement in the exam.






Roel De Nijs wrote:

Helen Ma wrote:Do we need to create multiple threads to read/write the db file in standalone mode?


No.



Hi, Roel. Thanks for this reply. I can do the work in a way similar to Monkhouse's approach for the standalone mode: just one client GUI doing the reading/writing, with no other threads doing concurrent reading/writing.

So, Roberto's Data test involves threads doing concurrent updates/finds... I am sure this test is simulating requests from multiple RMI clients.