How can I code a servlet to recover from a network blip (we all have them occasionally) that causes all of the physical database connections to be severed?
Despite the fact that these physical connections are now stale, DataSource.getConnection will continue to return them like they are good which then results in a continuous stream of database errors. I know that restarting the servlet will work, but this means downtime and human intervention. There should be a way to re-establish these connections within the code. I can trap for exceptions that indicate this situation, but I don't see a way drop the stale physical connection within the connection pool and create a new one.
Generally, once an exception on the connection managed by a pool is detected, the connection pool marks the connection unusable and physically closes it and throws it away once it's been returned to the pool. So the problem is not that your invalid connection remains forever in the pool, but that after the network blip all connections currently in the pool will be invalid and subsequent requests will obtain invalid connections from the pool until all of them are tried, found faulty and expunged from the pool. (This might be implementation dependent, I've learned this much form the Oracle's documentation of its own implementation.)
Most connection pools can be configured to verify connections before handing them out, which would resolve this issue completely (I assume), but it has some overhead (the connection is verified by issuing a harmless SQL statement on it). Depending on your application, the overhead can be very high.
An untested idea of mine: if you don't want to have the connection validated by the pool at every request, you might put additional logic into your code where the connection is being acquired. You'd set a flag whenever a database error has occurred, and in that case you'd validate the connection after it was acquired from the pool, and if it was not valid, retry. When getting a valid connection eventually, you'd clear the error flag. This assumes that you can track an occurrence of SQLException throughout your application, and that during normal execution your application does not produce lots of these.
This might be too DIY. I'm no expert on this, I just had to solve a couple of connection pool issues myself and spent some time thinking and reading about them. Hope someone will be able to come up with a better solution.
I recall having this problem several years ago; I still have the code in production which is catching SQLException and retrying the operation if the class name of the exception ends with "StaleConnectionException". Our server has been upgraded several times since I did that and I have no idea whether the retry code actually does anything any more. It isn't broken, it's a low-volume application used only by people in my department, so there's no reason to review it.
My impression is that I haven't had to deal with this problem since I wrote that application, so maybe connection pooling has improved since then. Anyway I would certainly agree with Martin that investigating configuration options to make the problem go away is likely to be a better idea than building workarounds into your code. In my case I only had JDBC code in one place, so I only had to put my hack protective code in one place. If there were 78 servlets all with JDBC code, or if I had an ORM dealing with the database, it would have been another question entirely.