JavaRanch » Java Forums » Certification » Developer Certification (SCJD/OCMJD)

Reading entire db into memory and deferred update..bad idea?

Don Burke
Greenhorn

Joined: Sep 20, 2005
Posts: 14
Hi guys,

Since I downloaded my exam on Tuesday I've been checking out the posts on this forum, and I've already picked up a few things, which I'm really thankful for. Cheers.

For my server-side design I'm considering reading the entire db into a data structure (I'll term this a dataset) when the server boots. Clients requesting or modifying records will actually be working with the dataset. The actual updating of the db file will be deferred and done periodically, at which point the entire dataset will be flushed to disk. I like this idea as it will cut IO costs considerably when many clients are using the service.

My biggest concern with this design is its reliability. What if the server crashes and bookings are lost? What if the memory becomes corrupted somehow and the corrupted data is then flushed? Is this an acceptable risk?

I tend to think that if I do this I need a recovery solution, which adds to the complexity of the design a bit. The periodic flushing of the dataset to disk would take place every 30 seconds or so.

I'd like to know what you all think. To me this is a cleaner solution than reading/writing the file for every client request, but I may be wrong, which is why I'd like your honest opinions on how to deal with IO on this assignment.
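Roughly, what I have in mind for the periodic flush is something like this (all names are placeholders, and the Runnable stands in for whatever actually serializes the dataset back to the db file):

```java
import java.util.Timer;
import java.util.TimerTask;

// Sketch only: a daemon Timer that flushes the whole in-memory dataset
// every periodMillis. The Runnable is a stand-in for the real file I/O.
class DeferredWriter {
    private final Timer timer = new Timer(true);   // daemon: dies with the JVM
    private final Runnable flushToDisk;

    DeferredWriter(Runnable flushToDisk) {
        this.flushToDisk = flushToDisk;
    }

    void start(long periodMillis) {
        timer.scheduleAtFixedRate(new TimerTask() {
            public void run() {
                flushNow();
            }
        }, periodMillis, periodMillis);
    }

    void flushNow() {
        flushToDisk.run();                         // flush the entire dataset
    }
}
```

The obvious catch, as I said above, is that anything changed since the last flush is lost on a crash.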

regards

db
Thomas Bigbee
Ranch Hand

Joined: Nov 29, 2001
Posts: 48
In the days of yore (late 80s to early/mid 90s) most well-designed apps tried to get around the concurrent update issue (we both work on the same record "at the same time" and try to save it) by using versioning, usually an update counter or a timestamp plus some business logic during the save. Today most applications use a form of optimistic locking (we both work on the same record "at the same time"; I save first; you try to save and are either prevented or allowed to merge, depending on configuration). In that scheme, working on detached result sets is perfectly acceptable, because versioning is present.

With that said, I see no reason to "flush your detached rowset to disk"; just implement versioning. This could be as simple as knowing your row number (or row position) and all your original values. Then, upon updating (since you cannot change the schema), compare your original values against your current ones: if no changes are present, let the user know and don't save; if changes are present, lock the record and check the current record against the original before saving. For what it's worth, I'm using versioning too, but I'm reading in records as needed instead of using detached rowsets, because that's the way we used to do it when the dinosaurs were roaming the earth.
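Roughly, the check could look like this (the class and method names are just for illustration, not the assignment's actual interface, and the record lock is faked with a synchronized block):

```java
import java.util.Arrays;

// Versioning sketch: keep the original field values the client read, and
// only apply the update if the record still matches them. Names invented.
class VersionedUpdate {
    /**
     * Returns true and applies the update only if the record still matches
     * what the client originally read; returns false on a stale read.
     */
    static boolean update(String[][] db, int recNo,
                          String[] original, String[] updated) {
        synchronized (db) {                      // stands in for the record lock
            if (!Arrays.equals(db[recNo], original)) {
                return false;                    // someone saved first
            }
            db[recNo] = updated.clone();         // safe to save
            return true;
        }
    }
}
```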

Hope this answers some of your questions,
Tom
Bodenstab Oliver
Ranch Hand

Joined: Mar 03, 2005
Posts: 47
Hey,

I also use a cached database.
In your approach you should think about crashes, so it is better to also change the values in the database file whenever you change the cached db. Both actions must be part of the same transaction. By transaction, I mean you must implement a good locking approach.

Oliver


SCJP
Bodenstab Oliver
Ranch Hand

Joined: Mar 03, 2005
Posts: 47
Sorry,

I meant a locking approach, of course.
Juan Rolando Prieur-Reza
Ranch Hand

Joined: Jun 20, 2003
Posts: 236
Originally posted by Thomas Bigbee:
... an update counter or a timestamp and some business logic during the save. Today most applications use a form of optimistic locking (we both work on the same record "at the same time"; I save first; you try to save and are either prevented or allowed to merge, depending on configuration), ...


These techniques seem to address concerns that are beyond the scope of the project requirements. Are transaction processing activities required? No. Record-level locking should be achieved as described in the instructions. A race condition can be avoided by writing to the file system within the lifetime of a "lock". (I don't know why we would ever not want to use NIO for this.) Unless there is an explicit requirement that display-only data in the GUI stay valid, there is no implicit transaction. Data selected for update should, obviously, not be changed by others while the user makes the update.
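A sketch of "write within the lifetime of the lock" using NIO (the fixed record length and layout here are invented for illustration; the real assignment defines its own schema, and the caller is assumed to already hold the record lock):

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// The write happens while the caller still holds the record lock, so no
// other client can interleave an update on the same record. The 32-byte
// fixed record length is an invented example.
class LockedWriter {
    static final int RECORD_LENGTH = 32;

    static void writeRecord(FileChannel channel, int recNo, String data)
            throws Exception {
        byte[] record = new byte[RECORD_LENGTH];       // zero-padded record
        byte[] src = data.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, record, 0, Math.min(src.length, RECORD_LENGTH));
        // positional write: no shared file-pointer state between clients
        channel.write(ByteBuffer.wrap(record), (long) recNo * RECORD_LENGTH);
    }
}
```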


Juan Rolando Prieur-Reza, M.S., LSSBB, SCEA, SCBCD, SCWCD, SCJP/1.6, IBM OOAD, SCSA
Thomas Bigbee
Ranch Hand

Joined: Nov 29, 2001
Posts: 48


Originally posted by Juan Rolando Prieur-Reza:

These techniques seem to address concerns that are beyond the scope of the project requirements. Are transaction processing activities required? No. Record-level locking should be achieved as described in the instructions. A race condition can be avoided by writing to the file system within the lifetime of a "lock". (I don't know why we would ever not want to use NIO for this.) Unless there is an explicit requirement that display-only data in the GUI stay valid, there is no implicit transaction. Data selected for update should, obviously, not be changed by others while the user makes the update.


There are a number of "requirements" that are not stated as requirements. Case in point: while searching this board, I have found instances of people failing because of something as simple as not implementing the 48-hour rule (which is not a MUST, just a statement). Every multi-user application I've worked on in the last umpteen years has had to make sure versioning was implemented in some way, shape, or form. It's never been a requirement per se, but it has always come up during testing. If I were the grader, I would be sure to check on this; anyone not implementing versioning would automatically fail.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11424
    

Hi Don,

Welcome to JavaRanch and this forum.

I think using a data cache for searching for records and retrieving individual records is easily justified - there is little work needed. However I would not risk the loss of data that could occur if you had a system crash.

Before going down the path of caching writes, in real life I would be talking to the customer about what sort of server they are running, what sort of uptime they expect, what sort of UPS they are running, and whether the server application will get a shutdown notification from the UPS (and then writing additional code to handle that notification), and I would get them to sign a huge disclaimer protecting me if they lose millions of dollars in dropped transactions.

Much easier to just write the data back to disk whenever there is an update - after all, the ratio of writes to reads should be small enough that this should not cause major delays.
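A write-through shape like this is what I mean: reads and searches stay in memory, but every update hits the disk before the cache is touched (persist is a placeholder for the real file I/O, and all the names are invented):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Write-through sketch: the disk is updated first, then the cache, so a
// crash never loses a confirmed booking.
class WriteThroughCache {
    private final Map<Integer, String[]> cache = new HashMap<>();
    private final BiConsumer<Integer, String[]> persist;

    WriteThroughCache(BiConsumer<Integer, String[]> persist) {
        this.persist = persist;
    }

    synchronized String[] read(int recNo) {
        return cache.get(recNo);           // reads are served from memory
    }

    synchronized void update(int recNo, String[] data) {
        persist.accept(recNo, data);       // disk first ...
        cache.put(recNo, data);            // ... then keep the cache consistent
    }
}
```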

Regards, Andrew

PS Swans or Eagles for the grand final on the weekend? I think Swannies.


The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog
Don Burke
Greenhorn

Joined: Sep 20, 2005
Posts: 14
Thanks for the replies guys, some good input from you all.

I'm definitely not going with deferred updates now (a read cache seems the way to go); it's just not worth the risk. As you said, the ratio of reads to writes will be in our favour.

Versioning is something I didn't consider beforehand, but I will look at it seriously this week.

Andrew, good thing you're a Swans fan; it was a good win. I'm tipping the Cowboys to get up this week.

yeha!!

db
Pawel Poltorak
Ranch Hand

Joined: Sep 21, 2005
Posts: 36
Hi,

This cache topic is very interesting. I was considering a cache but haven't implemented it yet. I was concerned about memory problems, because in theory a large number of records could cause the application to crash with an OutOfMemoryError. On the other hand, a cache is faster.

Do you think that the non-cache approach is somehow worse than the cached one?

Thanks in advance,
Pawel


SCJP, SCJD
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11424
    

Hi Pawel,
Do you think that the non-cache approach is somehow worse than the cached one?
Hmm, I am going to start by answering the opposite of your question: "Do you think that the cached approach is somehow worse than the non-cached one?"

From the instructions: "You will not receive extra credit points for work beyond the requirements of the specification." So you will not gain anything by implementing a cache, but you could conceivably lose marks if you make a mistake in implementing it.

Now having said that, a cache is easy to implement, and should give a significant performance boost.

Regarding your memory concerns - how big is a record? How big would the database have to grow before you would start having problems with physical memory? (and if the database really did contain millions of records, do you think that they would still be using your classes?) Conversely, since the majority of operations on the database are likely to be reads and searches, could you imagine how slow it would be if you didn't have the database cached?
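As a back-of-the-envelope check (the 160-byte record size is an assumption; plug in the actual record length from your schema):

```java
// Rough arithmetic only: even a million 160-byte records is about 160 MB,
// which says more about the scale at which a cache becomes a problem than
// about any database you are likely to see in this assignment.
class CacheSizeEstimate {
    static long bytesNeeded(long records, int bytesPerRecord) {
        return records * bytesPerRecord;
    }
}
```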

Regards, Andrew
Pawel Poltorak
Ranch Hand

Joined: Sep 21, 2005
Posts: 36
Thanks for the answer,

I guess I won't implement a cache. If any problems arise with performance, the Data class could be subclassed and the cache implemented in a subclass.
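Something like this is what I have in mind, in case it helps anyone else (Data here is only a stand-in for the assignment's real Data class and its read method):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the assignment's Data class; pretend read() does file I/O.
class Data {
    String[] read(int recNo) {
        return new String[] { "record-" + recNo };
    }
}

// Subclass that memoizes reads: the superclass is only hit once per record.
class CachedData extends Data {
    private final Map<Integer, String[]> cache = new HashMap<>();

    @Override
    synchronized String[] read(int recNo) {
        return cache.computeIfAbsent(recNo, super::read);
    }
}
```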

Best wishes,
Pawel
 