NX: cacheless design to keep things simple?

Jacques Bosch
Ranch Hand

Joined: Dec 18, 2003
Posts: 319
Hi guys.
Here is another question I have.
Instructions say:

A clear design, such as will be readily understood by junior programmers, will be preferred to a complex one, even if the complex one is a little more efficient.

Because of this, I decided to implement my Data class without a data cache to keep the design simple (and less efficient as a by-product).
But the instructions also say:

Your server must be capable of handling multiple concurrent requests, and as part of this capability, must provide locking functionality as specified in the interface provided above.

Now, my problem is that I can't think of a way to handle multiple concurrent requests (really) without using an in-memory data cache.
Let me explain:
Reads and writes on my raf obviously have to be synchronized to prevent data corruption. But if I don't use a cache, writing and reading directly from the raf, only one client can ever perform a read/write at any one time, because of the synchronization on the raf. Even if two clients are updating two different records, one will have to wait for the other to finish.
Whereas it would seem desirable that two different records should be simultaneously updatable by two separate clients (i.e. only if the two clients try to update or read the same record at the same time should the locking mechanism make one of them wait). Otherwise, it seems to me that the server cannot really *handle multiple concurrent requests*.
Is this correct?
Is there something I'm missing? Is there a way I can give it the capability of *handling multiple concurrent requests* without using a cache?
Else I'll have to use a cache, so that two different *cached* records can be simultaneously updated by two separate clients. And then it's the cache's responsibility to queue the updates and send them to the raf synchronously.
But I'll prefer the "cacheless" approach if viable.
Phil, Andrew, anybody, some thoughts would be greatly appreciated.


Jacques
*******
MCP, SCJP, SCJD, SCWCD
George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
Hi Jacques,
Originally posted by Jacques Bosch:
Reads and writes on my raf obviously have to be synchronized to prevent data corruption. But if I don't use a cache, writing and reading directly from the raf, only one client can ever perform a read/write at any one time, because of the synchronization on the raf. Even if two clients are updating two different records, one will have to wait for the other to finish.

I think what you say here is true. I don't see an alternative to this if you are cacheless. If there's a single raf, then you only want one operation involving the raf at any one time. Otherwise, you run the risk of corrupting the database or reading from the wrong location, etc.
What you are discussing is the need for database access synchronization. But the need for locking is broader than that. There is a need to treat multiple database operations as a single operation. Think of the steps involved in booking a record. You need to read the record (to make sure you're seeing the latest values for the record) and then you want to update the record with new values. Moreover, you want these two things to happen in isolation, that is, without interference from any other database operations. Synchronization of the raf will guarantee that your read won't interfere with your update. That is, first your read will execute, and when it's finished, then your update will execute.
But, raf synchronization does not guarantee that another client will not update your record after you have read it and before you have had the chance to update it. For this, you need a record locking mechanism. If you lock the record, read the record, update the record, and then unlock the record, then the scenario in the first sentence of this paragraph cannot occur. From the point of obtaining the lock on the record until you unlock the record, you have exclusive hold on the record in question from all other clients. Of course, this means that the database operations have to be written to enforce the locking mechanism.
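The lock / read / update / unlock sequence described above might be sketched roughly as follows. This is only an illustration under stated assumptions: `BookingSketch`, its in-memory map standing in for the data file, and the `book` method are all hypothetical names, and the sketch does not check that the caller of `update` actually holds the lock, which a real solution would have to enforce.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the data file, so the sketch runs on its own.
class BookingSketch {
    private final Set<Integer> lockedRecords = new HashSet<>();
    private final Map<Integer, String> customerIds = new HashMap<>(); // recNo -> customer ID

    // Step 1: wait until no other client holds the lock on this record.
    synchronized void lock(int recNo) throws InterruptedException {
        while (lockedRecords.contains(recNo)) {
            wait();
        }
        lockedRecords.add(recNo);
    }

    // Step 4: release the record and wake up any waiting clients.
    synchronized void unlock(int recNo) {
        lockedRecords.remove(recNo);
        notifyAll();
    }

    // Steps 2 and 3: short synchronized operations standing in for raf access.
    synchronized String read(int recNo) {
        return customerIds.getOrDefault(recNo, "");
    }

    synchronized void update(int recNo, String customerId) {
        customerIds.put(recNo, customerId);
    }

    // Books the record unless it already has a customer ID; returns true on success.
    boolean book(int recNo, String customerId) throws InterruptedException {
        lock(recNo);
        try {
            String current = read(recNo);   // latest value: nobody else may change it now
            if (!current.isEmpty()) {
                return false;               // someone else booked it first
            }
            update(recNo, customerId);
            return true;
        } finally {
            unlock(recNo);                  // always release, even if the update fails
        }
    }
}
```

Note that the individual `read` and `update` calls are each synchronized, but it is the record lock held across both that prevents another client from slipping in between them.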
Hope this helps,
George
[ January 10, 2004: Message edited by: George Marinkovich ]

Regards, George
SCJP, SCJD, SCWCD, SCBCD
Jim DiCesare
Greenhorn

Joined: Nov 23, 2003
Posts: 14
How do you solve this problem of "am I seeing the most recent data when I update?" Would this mean that you have to pass the entire record you're seeing, and the changes you're making, all the way to the data access class, which will make sure that what you were looking at is the most recent, and if not throw an exception saying something like "data has changed, do you want to view it before you update?"
George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
Hi Jim,
Originally posted by George Marinkovich:
For this, you need a record locking mechanism. If you [1] lock the record, [2] read the record, [3] update the record, and then [4] unlock the record, then the scenario in the first sentence of this paragraph cannot occur. From the point of obtaining the lock on the record until you unlock the record, you have exclusive hold on the record in question from all other clients.

Expanding on the above:
When you are at step 2, you have already obtained the lock for your record of interest in step 1. Assuming your database operations have been written to respect locking, then you know at step 2 that you are the only client allowed to change the record. If no one, other than you, can change the record at this point, then reading the record now yields the latest values for the record. Because you are reading in a locked context you know that no one else can possibly change the values of the record as long as you maintain your lock. So, step 2 returns the latest values for a particular record.
Somehow you allow the user to modify the record and then in step 3 you update the record. Again, you are the only client authorized to do an update (at this time) because you are the only client holding the lock. In step 4 you relinquish your lock and then your guarantee of exclusivity ends.
I do not broadcast changes in the database to my clients. So, yes a client could look at a record in his JTable that does not appear to be booked (there is currently no customer number assigned). The user presses a button to book that record and a customer ID editor window appears that allows the user to enter the customer ID. But when he does this, instead of the blank customer ID field the user was expecting (based on how it looked in the JTable), the user now sees that there is already a customer ID in the field. What happened?
The user was looking at a stale record in the JTable. When he tried to book it, the record was refreshed (as explained above) and now he sees that at some time after he last refreshed the screen some other client has booked the record he wants. So the user can decide at this point whether he still wants to book the record (thereby overwriting someone else's booking, which under some circumstances may be the appropriate thing to do), or cancel the booking operation because he doesn't want to book the record now that someone else has already done so (most of the time this is probably the proper course).
This scheme is based on a lot of assumptions about what is required by the assignment. It's definitely not the only way to do locking. It may not even be the best way to do locking. It's just the way that makes sense to me. It's taken me an embarrassing amount of time to reach my current understanding. I thought I understood locking many times, but often my understanding was erroneous or at least incomplete (maybe it still is). I think there was a lot of confusion in my mind between serializing concurrent access to the database file, and protecting multiple database operations from a particular client from interruption by other clients. The first might be called an integrity concern while the second is more a concern for atomicity (that is, wanting to treat multiple database operations as if they were a single database operation). In my opinion both of these concerns need to be addressed in the solution. How they should be addressed is, of course, up to you.
Hope this helps,
George
[ January 10, 2004: Message edited by: George Marinkovich ]
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11476

Hi Jacques,
Now, my problem is that I can't think of a way to handle multiple concurrent requests (really) without using an in-memory data cache.

I think it depends on what you define as a concurrent request.
When two clients want to read a record, they expect to get back an array of Strings. But your RAF access will be working in bytes. So obviously there is far more to that request than just the RAF access. Think of how many steps are involved in that "read" operation (record number validation, physical read, breaking into fields, converting to Strings ....). Now think about how many of those steps can be done concurrently.
The same is true with the other operations. The update has a few more steps, many of which can be performed concurrently. And the find is an even more obvious example - the find is performing multiple reads and doing comparisons on the data read. So pretty much all the work in a find can be done concurrently.
Even with a cache, you are still going to have to write the data back to the file at some point. Unless you create a separate thread to handle writes, and just queue them, you are still going to have to drop back to a synchronized block at some point.
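To make the point about the steps of a "read" concrete, here is a minimal sketch in which only the physical read is synchronized and the validation and field splitting run outside the synchronized block. The class name `NarrowSyncReader` and the record layout (two fixed-width fields, no file header) are assumptions made up for the example, not the assignment's actual format.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class NarrowSyncReader {
    // Hypothetical layout: fixed-length records made of two fixed-width fields.
    static final int[] FIELD_LENGTHS = {8, 4};
    static final int RECORD_LENGTH = 12;

    private final RandomAccessFile raf;

    NarrowSyncReader(RandomAccessFile raf) {
        this.raf = raf;
    }

    String[] read(int recNo) throws IOException {
        // Validation touches only the argument, so it needs no synchronization.
        if (recNo < 0) {
            throw new IllegalArgumentException("bad record number: " + recNo);
        }
        byte[] raw = new byte[RECORD_LENGTH];
        synchronized (raf) {
            // Only the seek + read pair must not be interleaved with other threads.
            raf.seek((long) recNo * RECORD_LENGTH);
            raf.readFully(raw);
        }
        // Splitting and converting works on a local array; threads can overlap here.
        return parse(raw);
    }

    static String[] parse(byte[] raw) {
        String[] fields = new String[FIELD_LENGTHS.length];
        int offset = 0;
        for (int i = 0; i < fields.length; i++) {
            fields[i] = new String(raw, offset, FIELD_LENGTHS[i],
                    StandardCharsets.US_ASCII).trim();
            offset += FIELD_LENGTHS[i];
        }
        return fields;
    }
}
```

Two threads calling `read` at the same time will still queue up at the `synchronized (raf)` block, but each spends only the time of one seek and one read there.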
Regards, Andrew


The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog
Jacques Bosch
Ranch Hand

Joined: Dec 18, 2003
Posts: 319
Hi there Andrew.
Thanx.

Even with a cache, you are still going to have to write the data back to the file at some point. Unless you create a separate thread to handle writes, and just queue them, you are still going to have to drop back to a synchronized block at some point.

Well, for the cache design, the separate thread writing a queue of updates/creates/deletes to the raf is what I've had in my head.
Non-cache:

... (record number validation, physical read, breaking into fields, converting to Strings ....). Now think about how many of those steps can be done concurrently.

It's early in the morning, and I think I'm a bit slow, but how do I go about doing everything, besides the raf access, concurrently from within my singleton Data object? Since methods doing things like "breaking into fields, converting to Strings" are private and called from within the synchronized read/update/find/etc., they will, for all intents and purposes, be synchronized as well.
So with what you are saying it sounds like those methods should be in the separate threads serving each client for them to be concurrent.
I'm probably missing something small and obvious. (Or big and obvious).
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11476

Hi Jacques,
It's early in the morning, and I think I'm a bit slow, but how do I go about doing everything, besides the raf access, concurrently from within my singleton Data object?

Just being a singleton does not stop multiple threads operating on it

You should be able to see that the multiple threads are running concurrently, but they do not interrupt each other within the synchronized block (although the synchronized block may be interrupted, only one thread can run it at any given time).
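A minimal sketch of the kind of demonstration described, not the original listing: several threads call a method on one shared singleton, and only the small synchronized block inside it is executed by one thread at a time. `SingletonDemo`, `handleRequest`, and `runDemo` are hypothetical names chosen for the example.

```java
class SingletonDemo {
    private static final SingletonDemo INSTANCE = new SingletonDemo();
    private int callsCompleted = 0;   // shared state, only touched while holding 'this'

    private SingletonDemo() { }

    static SingletonDemo getInstance() {
        return INSTANCE;
    }

    void handleRequest(int threadId) {
        // Any number of threads may be here at once.
        System.out.println("thread " + threadId + " entered handleRequest()");
        synchronized (this) {
            // Only one thread at a time runs this block.
            callsCompleted++;
            System.out.println("thread " + threadId + " inside synchronized block");
        }
        System.out.println("thread " + threadId + " leaving handleRequest()");
    }

    synchronized int getCallsCompleted() {
        return callsCompleted;
    }

    // Starts threadCount threads against the single instance and waits for them all.
    static int runDemo(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> getInstance().handleRequest(id));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return getInstance().getCallsCompleted();
    }
}
```

The order of the printed lines will vary from run to run, since the scheduler decides how the threads interleave outside the synchronized block.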
Regards, Andrew
Tony Collins
Ranch Hand

Joined: Jul 03, 2003
Posts: 435
Why don't you sync on the cache whilst updating ?
Or if not using a cache use filechannels instead of raf.
Tony
Jacques Bosch
Ranch Hand

Joined: Dec 18, 2003
Posts: 319
Hi Andrew.

Just being a singleton does not stop multiple threads operating on it

I know that. But then the calls can interfere with each other.
I.e. In the method you used:

2 threads call this at virtually the same time. So, since it isn't synchronized, what prevents the two calls from messing with one another's variable values?
Jacques Bosch
Ranch Hand

Joined: Dec 18, 2003
Posts: 319
Tony

Why don't you sync on the cache whilst updating ?

Trying not to use a cache as explained at the top of this post.

Or if not using a cache use filechannels instead of raf.

Why is that better? (Sad to say) I know nothing of FileChannels. Never used them before.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11476

Hi Jacques,
Originally posted by Andrew Monkhouse:
Just being a singleton does not stop multiple threads operating on it
Originally posted by Jacques Bosch:
I know that. But then the calls can interfere with each other.
I.e. In the method you used:

2 threads call this at virtually the same time. So, since it isn't synchronized, what prevents the two calls from messing with one another's variable values?

I deliberately wrote the doSomething() method to be thread safe. It does not use any instance variables, so it is quite safe for two threads to run it simultaneously.
Here's a minor modification to that method to put in a few method variables which should not be affected by multiple threads:

When I ran this, I got the following output:

I can see that thread 0 started first, but thread 1 finished first. So if there was going to be any corruption then it would be visible. But that is not the case - I can see that the method variable is still set to its original value, and the List contains only the single item that I expect.
However, if I was doing anything that was not thread safe (modifying an instance variable or a static variable) then I would have to take steps to make the method thread safe.
For a method like your find() method, you may find that you do not need to use any instance variables or static variables. You may be able to build the entire method using method variables. In which case you should be able to run multiple threads on that method simultaneously.
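As an illustration of the point about method variables (a sketch with made-up names, not the listing from this post): `doWork` below uses only its parameter and local variables, so several threads can run it on the same instance without seeing each other's values.

```java
import java.util.ArrayList;
import java.util.List;

class LocalVariableDemo {
    // No instance or static variables are used, so no synchronization is needed.
    String doWork(int threadId) {
        int localValue = 42;                      // each thread gets its own copy
        List<String> items = new ArrayList<>();   // each call creates its own list
        items.add("item-" + threadId);
        Thread.yield();                           // invite the scheduler to interleave threads
        return "thread " + threadId + ": localValue=" + localValue + ", items=" + items;
    }

    // Runs doWork on ONE shared instance from several threads and collects the results.
    static String[] runDemo(int threadCount) {
        LocalVariableDemo shared = new LocalVariableDemo();
        String[] results = new String[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            // Each thread writes only its own slot; join() below makes the writes visible.
            threads[i] = new Thread(() -> results[id] = shared.doWork(id));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return results;
    }
}
```

If `items` were an instance field instead of a local variable, the threads would share one list and the method would no longer be safe without synchronization.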
Regards, Andrew
Jacques Bosch
Ranch Hand

Joined: Dec 18, 2003
Posts: 319
Well, Andrew. Would you believe it: You have helped me greatly by clearing that up for me.
It's so obvious, but I completely forgot that synchronization problems only come into being when instance or static variables are in play, and not when only local variables are being used.
I kept thinking the local variables' values are going to be messed up if more than one thread executed a method on the same instance of a class.
THANX MUCH!
So, back to my cacheless design:
If I synch on the raf access only, and do everything else in unsynched, but synch-safe code, more concurrent processing will be achieved, only waiting when actual reads/writes are done on the raf.
Right?
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi Jacques,
Well, Andrew. Would you believe it: You have helped me greatly by clearing that up for me.

I can believe it. It's a quite common feeling for anyone reading Andrew's posts...
So, back to my cacheless design:
If I synch on the raf access only, and do everything else in unsynched, but synch-safe code, more concurrent processing will be achieved, only waiting when actual reads/writes are done on the raf.
Right?

Yes ! Here are examples of operations which could run concurrently in your design :
  • record conversion from array of String to byte array, both ways
  • in findByCriteria(), testing if a record matches the search criteria
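Both kinds of operation in the list above can be written as methods that work only on their arguments, which is what lets them run outside any synchronized block. A rough sketch follows; the class name `RecordCodec`, the space padding, and the matching rule (a null criterion matches anything, otherwise the field must start with the criterion) are assumptions for the example, so check them against your own instructions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RecordCodec {
    // Converts fields to a fixed-width, space-padded byte array (assumed format).
    static byte[] toBytes(String[] fields, int[] fieldLengths) {
        int total = 0;
        for (int len : fieldLengths) {
            total += len;
        }
        byte[] out = new byte[total];
        Arrays.fill(out, (byte) ' ');
        int offset = 0;
        for (int i = 0; i < fieldLengths.length; i++) {
            byte[] src = fields[i].getBytes(StandardCharsets.US_ASCII);
            // Values longer than the field are truncated in this sketch.
            System.arraycopy(src, 0, out, offset, Math.min(src.length, fieldLengths[i]));
            offset += fieldLengths[i];
        }
        return out;
    }

    // Assumed rule: null matches any value, otherwise the field must start with the criterion.
    static boolean matches(String[] record, String[] criteria) {
        for (int i = 0; i < criteria.length; i++) {
            if (criteria[i] != null && !record[i].startsWith(criteria[i])) {
                return false;
            }
        }
        return true;
    }
}
```

In a find operation, only fetching each record's bytes from the raf needs the synchronized block; the call to `matches` on the converted record can happen after the block has been left.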


    Regards,
    Phil.
    Jacques Bosch
    Ranch Hand

    Joined: Dec 18, 2003
    Posts: 319
    Hi Phil. (Nice weekend?)
    Thanx. I am busy making the changes now.
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Jacques,
    Hi Phil. (Nice weekend?)

    Nice weekend? I practiced English out loud nearly full time, till I felt my mouth was close to getting paralysed!
    Best,
    Phil.
    PS: Let me know when you get Max's book. It should be by no later than Thursday.
    Jacques Bosch
    Ranch Hand

    Joined: Dec 18, 2003
    Posts: 319
    Phil, in another post you said:

    ...or even better (and much simpler), use a FileChannel got from your RAF : FileChannel is thread-safe and allows concurrent access through the methods which take an explicit position.

    So, using a FileChannel, I don't have to explicitly synchronize on my raf every time I do a read or write like I'm doing now? And it allows concurrent reads?
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Jacques,
    So, using a FileChannel, I don't have to explicitly synchronize on my raf every time I do a read or write like I'm doing now? And it allows concurrent reads?

    You should have a look at the FileChannel doc. Here is an interesting excerpt in the context of your question :

    File channels are safe for use by multiple concurrent threads. (...) Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

    As I understand that, it means :
  • Effective concurrency will be performed only when methods take an explicit position (a "long position" as parameter)
  • Even in that case, it's not guaranteed ("dependent upon the underlying implementation"). But this is not an issue: after all we don't *need* the guarantee.


    Now I use a sort of synchronization anyway, to allow concurrent reads (from the file or my partial cache) and exclusive writes.
    Jacques, concurrency through FileChannels is a complex subject. I remember an interesting discussion mainly between Jim and Max about it, I think in this thread. You could read it for further information.
    Best,
    Phil.
    [ January 12, 2004: Message edited by: Philippe Maquet ]
    Jacques Bosch
    Ranch Hand

    Joined: Dec 18, 2003
    Posts: 319
    Thank you Phil.
    I'll check out the thread you gave.
    I have read the api docs on FileChannel and ByteBuffer.
    Looks better than using raf alone.
    To read from a channel, I can use:

    But how do I specify how many bytes to be read? Is that determined by the size of the ByteBuffer, just like the byte[] array with the raf reads?
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    But how do I specify how many bytes to be read? Is that determined by the size of the ByteBuffer?

    Yes, by the size *remaining* in the ByteBuffer.
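A small sketch of a positional read, to show how the buffer's remaining space decides how much is read. `ChannelRecordReader` and `readAt` are hypothetical names; `FileChannel.read(ByteBuffer, long)` is the standard method that takes an explicit position and leaves the channel's own position untouched.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class ChannelRecordReader {
    // Reads exactly 'length' bytes starting at 'position' in the file.
    static byte[] readAt(FileChannel channel, long position, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);   // remaining() == length at this point
        while (buffer.hasRemaining()) {
            // A single read may return fewer bytes than remain, so loop until the buffer is full.
            int n = channel.read(buffer, position + buffer.position());
            if (n < 0) {
                throw new EOFException("file ended before " + length + " bytes were read");
            }
        }
        return buffer.array();
    }
}
```

With a fixed record length, the position of a record would be something like the header length plus the record number times the record length; those values depend on your own file format.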
    Regards,
    Phil.
    Jacques Bosch
    Ranch Hand

    Joined: Dec 18, 2003
    Posts: 319
    Hi Phil.
    Just read that thread you specified. WOW! It is very long and complex and informative.
    My feeling is now, since I have already implemented my whole data class with a synchronized raf, I'm not going to switch to a file channel now.
     