I wanted to understand in what scenarios DB updates can throw a table lock exception. I want to design an app which supports concurrency to its full extent. It would be helpful if someone could tell me "when" table locks could create problems and "how to code keeping such scenarios in mind." For instance, is it possible for two threads to simultaneously update and delete from the same table, or otherwise "operate" on the same table at once? What actually happens when I invoke executeQuery() and executeUpdate() from my Java program, in terms of database operations when dealing with transactions (i.e. the lock is a DB concept for dealing with concurrency)? Are there any other related concepts, like the maximum number of concurrent "connections" to a database? What is the rationale behind that limit?
This is actually a topic which is highly database-dependent. There isn't a single answer to any of your questions which would hold true for all database implementations out there, especially if your goal is to take concurrency "to the full extent". You'd probably need separate implementations of your logic for every kind of database out there (one for MySQL, one for Oracle, one for MS SQL Server, one for DB2, ... you see), and possibly even for different versions of the same database. Just to illustrate the scope of the problem, let's see how Oracle and MS SQL Server differ regarding concurrency (please note that my info on MS SQL Server may be outdated and inaccurate, but for a general comparison this should hold true):
In Oracle, readers never obtain any locks. Writers do obtain locks, but these locks block only other writers, not any readers. It is possible to explicitly lock a whole table (or partition), but generally the standard operations (inserts, updates, deletes) only lock rows and these locks never escalate. Readers always see the database as it existed when the query (or transaction) began. In case two updates collide, an update might be restarted; generally this is not anything to worry about, but if you aim at highest possible concurrency, you should know and understand this phenomenon. Of course, two (or more) writers can deadlock each other.
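The Oracle behavior above (readers see the data as it existed when their query began, and writers never block them) can be illustrated with a toy Java sketch. This is not Oracle's implementation, just an analogy: each "commit" publishes a new immutable version of the data, and a reader keeps working with the version it pinned at query start.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Toy analogy of multiversion read consistency (not Oracle itself):
// readers take an immutable snapshot, so a concurrent writer never
// blocks them and they always see a consistent picture.
public class SnapshotDemo {
    // The "table": an immutable map swapped atomically on each commit.
    private final AtomicReference<Map<String, Integer>> table =
            new AtomicReference<>(Map.of("balance", 100));

    // A reader pins the version that existed when its "query" began.
    Map<String, Integer> beginRead() {
        return table.get();              // no lock taken
    }

    // A writer publishes a new version; running readers keep their old snapshot.
    void commitUpdate(String key, int value) {
        Map<String, Integer> next = new HashMap<>(table.get());
        next.put(key, value);
        table.set(Map.copyOf(next));
    }

    public static void main(String[] args) {
        SnapshotDemo db = new SnapshotDemo();
        Map<String, Integer> snapshot = db.beginRead();     // query starts
        db.commitUpdate("balance", 50);                     // concurrent write commits
        System.out.println(snapshot.get("balance"));        // 100 - snapshot is unchanged
        System.out.println(db.beginRead().get("balance"));  // 50  - a new query sees the update
    }
}
```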
In MS SQL Server, there are read locks and write locks. The consistency of database queries (if required by the active isolation level) is maintained by having readers lock the rows they've read, so that these rows cannot be updated. Readers can therefore block writers, and writers can block both readers and writers. Furthermore, the locks can and will escalate - if the number of locks on a table grows too high, several row locks can be replaced by one page lock, and even by a table lock if too many pages are locked. Deadlocks may occur not only among writers, but among readers too. (Later versions of MS SQL Server support multiversioning like Oracle does, but I have no idea how their solution compares to Oracle's. Given that Oracle has been built up from the multiversioning concept from the start, while in SQL Server this is functionality built on top of an already existing database, I'd say differences will exist even there.)
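The reader/writer locking model can be felt directly with Java's ReentrantReadWriteLock. Again just an analogy, not SQL Server itself: while a "reader" holds a read lock, a "writer" cannot get the write lock - the very situation that cannot arise under Oracle's multiversioning described above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy analogy of reader/writer locking (not SQL Server itself):
// an active reader blocks a writer. In a real database the reader and
// writer would be separate sessions; one thread suffices to show the rule.
public class ReadBlocksWriteDemo {
    public static void main(String[] args) throws Exception {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        lock.readLock().lock();          // a "reader" holds a read lock
        // The "writer" gives up after 100 ms - the read lock keeps it out.
        boolean writerGotIn = lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
        System.out.println("writer blocked by reader: " + !writerGotIn);

        lock.readLock().unlock();        // the reader finishes
        System.out.println("writer proceeds: " + lock.writeLock().tryLock());
    }
}
```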
Of course, there are many other subtle differences, and every one of them might influence the problem you're seeking to solve. An approach that would perform adequately in one database environment might perform miserably (or outright fail) in another.
A blanket target of maximum concurrency in all environments is therefore impossible to achieve. You'd need to specify some quantitative criteria you need to meet (eg. desired number of transactions per second), specify all target environments and then design your system in a way that would allow these criteria to be met. Depending on these needs, a common logic for all supported databases might be viable. Or maybe it won't be possible, and you might have to even design the database schema differently on different systems to achieve your goal, or reduce the requirements or target environments.
If you simply want to design an app that performs acceptably on a wide variety of platforms, some ORM framework might be actually quite good for you, as it does address concurrency and other issues in all supported databases somehow. However, since these frameworks have to support various databases, in every supported database it is probably possible to achieve higher performance/concurrency using approaches specific to that database.
Pretty neat reply! Could you tell me where I could get a high-level view of the MySQL database implementation, somewhere besides the manual? Also, could someone explain how JDBC works with MySQL with respect to concurrency: if a thread executed a create, update or delete operation on a table at the same time as another thread did something similar, would it fail or go through? If it fails, is there a way around this? Finally, how does the maxConnections variable operate? Can it be visualized as the maximum number of client requests to the server?
I can't help with MySQL. However, I wouldn't rule out reading the manuals outright - it is the foremost resource. Oracle database manuals are excellent and most of my knowledge comes from there. Given that MySQL should be fit for professional applications, the manuals should be adequate too.
Regarding the basic operations (select, insert, delete, update), JDBC should behave identically to any other database client. In particular, it should not lock more or less than anyone else. Generally, when a database client requests a resource (e.g. a lock) held by another client, it waits till the resource becomes free. Whether there are any timeouts or deadlock detection is database-specific. If you aim for a high-concurrency OLTP application, I'd expect that you'll make sure no deadlocks occur and will use short, quick transactions, so the basic locking behavior should generally be adequate for you.
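The classic way to make sure no deadlocks occur is to have every transaction acquire its locks in the same global order. Here is a toy Java sketch of that rule (the "rows" are just in-process locks, not database rows): because both "transactions" always lock rowA before rowB, one simply waits for the other and a deadlock is impossible.

```java
import java.util.concurrent.locks.ReentrantLock;

// Toy sketch of deadlock avoidance via consistent lock ordering.
// If one transaction locked rowA then rowB while the other locked
// rowB then rowA, they could deadlock; a fixed global order prevents it.
public class LockOrderingDemo {
    static final ReentrantLock rowA = new ReentrantLock();
    static final ReentrantLock rowB = new ReentrantLock();

    // Every "transaction" takes its locks in the same order: rowA, then rowB.
    static void transfer(String name) {
        rowA.lock();
        try {
            rowB.lock();
            try {
                System.out.println(name + " updated both rows");
            } finally {
                rowB.unlock();
            }
        } finally {
            rowA.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> transfer("tx1"));
        Thread t2 = new Thread(() -> transfer("tx2"));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("no deadlock");
    }
}
```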
Also, in Oracle it is necessary to use PreparedStatements with bind variables to achieve maximum OLTP performance. In MySQL I'd expect the same. Even if it didn't bring the performance benefit, it makes your application safe from SQL injection attacks, so remember to always use bind variables.
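To make the bind-variable point concrete, here is a small runnable sketch. The table and column names (users, name) are made up for illustration; the concatenated version is shown only to demonstrate the injection risk, and the PreparedStatement version is sketched in comments since it needs a live connection.

```java
// Sketch of why bind variables matter. The string-concatenation style
// lets user input rewrite the query; a PreparedStatement keeps it as data.
public class BindVariableDemo {
    public static void main(String[] args) {
        String userInput = "x' OR '1'='1";

        // BAD: user input is spliced into the SQL text itself.
        String concatenated =
                "SELECT * FROM users WHERE name = '" + userInput + "'";
        System.out.println(concatenated);
        // The WHERE clause now matches every row in the table.

        // GOOD: the SQL text is fixed and can be parsed once by the
        // database; the driver sends the input separately as data, so it
        // cannot change the meaning of the query.
        // PreparedStatement ps = connection.prepareStatement(
        //         "SELECT * FROM users WHERE name = ?");
        // ps.setString(1, userInput);
        // ResultSet rs = ps.executeQuery();
    }
}
```

The fixed SQL text is also what brings the Oracle performance benefit mentioned above: the statement can be parsed once and reused with different bind values.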
The maximum number of connections simply limits how many clients can be logged in to the database at once. A connection need not be active all the time, but it can be, so you should not let more connections in than your database can handle (generally, a box with a few CPUs cannot handle hundreds of active connections; it might struggle even with a few tens of them, but the CPU need not be the only bottleneck in your system). A connection is an expensive resource and initiating one takes some time, which is why you usually have some kind of application connection pool nowadays.
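The core idea of a connection pool can be sketched in a few lines. This is a deliberately minimal toy - the "connections" are just strings, and real pools (HikariCP, DBCP and the like) add validation, timeouts and much more - but it shows the essential mechanics: the expensive opens happen once up front, clients borrow and return connections, and a borrower blocks when the pool is exhausted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal connection-pool sketch: a fixed set of "connections" created
// once, borrowed and returned by clients. take() blocks when none are
// free, which is exactly how a pool caps concurrent database sessions.
public class TinyPool {
    private final BlockingQueue<String> connections;

    TinyPool(int maxConnections) {
        connections = new ArrayBlockingQueue<>(maxConnections);
        for (int i = 0; i < maxConnections; i++) {
            connections.add("conn-" + i);   // the expensive open happens once
        }
    }

    String borrow() throws InterruptedException {
        return connections.take();          // blocks if the pool is exhausted
    }

    void giveBack(String conn) {
        connections.add(conn);
    }

    public static void main(String[] args) throws InterruptedException {
        TinyPool pool = new TinyPool(2);
        String c1 = pool.borrow();
        String c2 = pool.borrow();
        System.out.println("available: " + pool.connections.size()); // 0 - pool exhausted
        pool.giveBack(c1);
        System.out.println("available: " + pool.connections.size()); // 1 - returned
    }
}
```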