I would like to know the best approach for handling concurrent updates.
I have searched this forum and googled, and came across the following solutions:
1. Create a lastupdatetime column and use it to disallow concurrent updates
2. Use an isolation level of Repeatable Read
For solution 1, I am not sure how it resolves concurrent updates when you are doing bulk updates. The solution presumes that you update records one after another (which is not always true).
Solution 2 leads to more restrictive locking and thus has a performance impact.
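To make the lastupdatetime idea (solution 1) concrete, here is a driver-free sketch of the usual optimistic check: each row carries a version, and an update only succeeds if the version it read is still current. The Row record, update() helper, and account key are all hypothetical; in JDBC the same check collapses into a single UPDATE ... WHERE version = ? statement whose executeUpdate() count you inspect.

```java
import java.util.concurrent.ConcurrentHashMap;

public class OptimisticLockDemo {
    // In-memory stand-in for a table row: a value plus its version number.
    record Row(long version, int balance) {}

    static final ConcurrentHashMap<String, Row> table = new ConcurrentHashMap<>();

    // Returns true if the update was applied, false if another writer won.
    // The JDBC equivalent is:
    //   UPDATE account SET balance = ?, version = version + 1
    //   WHERE id = ? AND version = ?
    // treating executeUpdate() == 0 as a concurrent-modification conflict.
    static boolean update(String id, long readVersion, int newBalance) {
        Row current = table.get(id);
        if (current == null || current.version() != readVersion) {
            return false; // the row changed since we read it: a conflict
        }
        // replace() is atomic, mirroring the single conditional UPDATE
        return table.replace(id, current, new Row(readVersion + 1, newBalance));
    }

    public static void main(String[] args) {
        table.put("acct-1", new Row(1, 100));
        System.out.println(update("acct-1", 1, 90)); // true: first writer wins
        System.out.println(update("acct-1", 1, 80)); // false: stale version
    }
}
```

The same pattern works for bulk updates if every UPDATE carries the version predicate and you count how many rows were actually affected.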
I would like to hear from experienced people here how they have designed this in their applications.
My application will be a high-volume web application, so concurrent updates are likely.
Keeping this in mind, can anybody share their thoughts on an optimum design for this situation?
I don't believe there are easy solutions to prevent concurrent processes from updating conflicting sets of rows while ensuring a sensible output. It does not matter whether the processes are automated or user-initiated. If you have different processes that might conflict in the database, you'll probably need to design a way to coordinate them. Otherwise "last one wins", as has already been mentioned, and in general your system will seem unpredictable. The coordination might be based on database locking in some way, or it might be purely operational (for example, users are instructed to perform their work in a way that does not cause these conflicts to happen).
Just setting the isolation level not only fails to resolve the issue (see "last one wins"), but is highly database dependent, as different databases process isolation levels differently. As different databases lock rows or escalate locks in different ways, a pattern that works in one database might cause deadlocks or incorrect results in another, for example.
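To illustrate the "last one wins" lost update: two transactions both read a value, both write back an incremented result, and the later commit silently erases the earlier one, even though no statement failed. A minimal stand-alone sketch, with a HashMap standing in for the table:

```java
import java.util.HashMap;
import java.util.Map;

public class LostUpdateDemo {
    public static void main(String[] args) {
        Map<String, Integer> db = new HashMap<>();
        db.put("counter", 100);

        // Two transactions both read the current value. Under MVCC-style
        // repeatable read neither read takes a lock, so both succeed.
        int readA = db.get("counter");
        int readB = db.get("counter");

        // Each adds 10 and writes back. B commits last and silently
        // overwrites A's change: "last one wins".
        db.put("counter", readA + 10);
        db.put("counter", readB + 10);

        System.out.println(db.get("counter")); // prints 110, not 120
    }
}
```

No isolation level setting changes this program's shape; preventing the lost update requires either a conflict check (a version column) or an explicit lock taken before the read.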
Thanks Martin for the explanation. I didn't understand when you said that the isolation level would not solve the issue. Isn't there a standard set of isolation levels, and wouldn't the behavior be consistent as long as the vendor complies with the JDBC spec?
Why do you say that the behavior of an isolation level will depend on the database implementation? This is where "design by contract" comes into the picture: if a vendor says that their DB driver is JDBC-spec compliant, then we should always get consistent behavior.
I want to be practical, and I am only concerned with a handful of databases (Oracle, Sybase, DB2). Isn't it true that these companies' driver implementations honor the JDBC spec?
I'm not an expert on the JDBC specification, but I'd say the differences are mostly in areas the JDBC specification does not cover. Though JDBC describes isolation levels (remember, not all databases support all isolation levels to start with), the implementation details on which you might, possibly inadvertently, rely are not covered. There are databases (SQL Server or MySQL, for example) where repeatable reads are ensured by read locks. Other databases (Oracle, for example) provide repeatable reads by multiversioning and therefore never use read locks. A system that relies on read locks for proper concurrency control might therefore work on MySQL, but it will produce unexpected results on Oracle, because reads won't block writes there. On the other hand, a program that works well on Oracle might be unworkable on SQL Server due to the additional read locks, which can lead to inadequate performance or even deadlocks. Even where the locking mechanism is similar (write locks), differences might lie in the lock escalation strategy. Some databases force you to commit more often than you otherwise would like, to work around concurrency issues. And so on.
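One way to coordinate explicitly, rather than relying on isolation-level side effects, is an explicit row lock such as SQL's SELECT ... FOR UPDATE. A sketch of that pattern, with a ReentrantLock standing in for the row lock so the hand-over is visible without a database (the withdraw() helper and balance are hypothetical):

```java
import java.util.concurrent.locks.ReentrantLock;

public class PessimisticLockDemo {
    static final ReentrantLock rowLock = new ReentrantLock();
    static int balance = 100;

    static void withdraw(int amount) {
        rowLock.lock(); // analogue of SELECT balance ... FOR UPDATE
        try {
            int read = balance; // a read taken under the lock cannot go stale
            balance = read - amount;
        } finally {
            rowLock.unlock(); // analogue of COMMIT releasing the row lock
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> withdraw(10));
        Thread b = new Thread(() -> withdraw(20));
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(balance); // 70: neither update is lost
    }
}
```

The trade-off is exactly the one discussed above: explicit locks serialize the writers (a throughput cost), and holding them across long transactions invites deadlocks, so they should be taken late and released quickly.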
To be able to design and develop reliable multiuser DB applications, one needs to know much more than just the JDBC specification. Deep knowledge of the database (or all supported databases, if you strive for database independence) is a must.