Hi,
We have a J2EE web application currently running in production on IBM WebSphere 4.0 with UDB 7.2 as the database (both on Windows 2000) and IBM HTTP Server 1.3.19. This application comfortably sustains 150 concurrent users.
We have migrated the application to IBM WebSphere 5.1.1 and the database to UDB 8.1 (both on Windows 2003), with IBM HTTP Server 1.3.28.
The application works fine under a minimal load of 1 or 2 concurrent users, but as soon as the load increases to 4-5 users we start getting the following message in the log:
"TimeoutManage I WTRN0006W: Transaction 57415344:000000000000008700000001056ef56f518baea7691f0ecb02908710140d906973657276657231[] has timed out after 120 seconds."
The entire pool of 50 connections is exhausted at this stage. We have ensured that all the connections are getting closed.
Can somebody please help us out on the matter?
Thanks in advance,
Roopmay
Further details about the application, and the things we have already tried, follow.
The application uses Struts 1.1 on the front end.
The application makes use of container managed transactions.
We have tried altering the data source connection pool parameters (Connection timeout, Reap time, Unused timeout, Aged timeout), but the problem persisted.
We also changed the Total transaction lifetime timeout property of the transaction service, but no luck.
The problem was escalated to IBM, and this is what they had to say:
"The problem is caused by a single thread holding on to multiple connections in different LTCs.
For example thread 58d20027 does this in trace_04.08.30_17.07.02.log. Gets a connection in LTC 4a11006a, then uses and closes the connection. This happens several times in LTC 4a11006a, until eventually there is a case where it gets a connection, uses it, then gets another one without closing the first connection. This can be seen by searching for "58d20027 SystemOut O" in the trace. This results in two physical connections being used by the thread.
A user transaction is then started, and another connection is used by the thread - bringing the count up to 3.
Finally the user transaction and the LTC end, returning the connections to the pool.
There are no unclosed connections, and the connections are all returned to the pool eventually; however, this is just one case where a single thread held on to 3 connections for a given period of time.
The root of the problem is that each thread appears to tie up more than one physical connection, because more than one LTC is active on the thread. With 47 threads running, eventually the pool (max 50) is exhausted, and waiters build up. The connection wait time is set to 30 minutes; if a thread is unable to acquire a connection, it will wait that long before timing out. These threads are deadlocking themselves."
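IBM's arithmetic can be sketched outside the container: if each of N threads can hold up to k physical connections at once (one per still-open get in an LTC, plus one inside the user transaction), peak demand is N x k, so 47 threads peaking at 3 connections would want 141 connections against a pool of 50. A minimal, self-contained simulation of that counting (a plain java.util.concurrent.Semaphore stands in for the J2C pool; PoolDemand and its counters are our own illustration, not a WebSphere API):

```java
import java.util.concurrent.Semaphore;

public class PoolDemand {
    static int held = 0;      // connections the thread currently holds
    static int peakHeld = 0;  // worst case observed

    // Acquire a "connection" from the pool, or fail if the pool is empty.
    static void get(Semaphore pool) {
        if (!pool.tryAcquire()) throw new IllegalStateException("pool exhausted");
        held++;
        peakHeld = Math.max(peakHeld, held);
    }

    static void close(Semaphore pool) {
        held--;
        pool.release();
    }

    public static void main(String[] args) {
        Semaphore pool = new Semaphore(50);

        // get-use-close: the thread never holds more than one connection.
        for (int i = 0; i < 5; i++) { get(pool); /* use */ close(pool); }
        System.out.println("sequential peak = " + peakHeld); // 1

        // Nested gets, as in the trace: a second (and third) connection is
        // acquired before the first is closed.
        peakHeld = 0;
        get(pool);   // connection in the first LTC, left open
        get(pool);   // second connection on the same thread
        get(pool);   // third, inside the user transaction
        close(pool); close(pool); close(pool);
        System.out.println("nested peak = " + peakHeld); // 3

        // 47 threads each peaking at 3 want 141 connections; the pool has 50.
        System.out.println("worst-case demand = " + (47 * 3));
    }
}
```

The point is only the multiplication: no connection is leaked, yet a per-thread peak above 1 inflates demand past the pool size.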
IBM has clearly stated that the problem lies in our code and not in the application server.
Per this suggestion, we ensured that we never have multiple connections open at the same time, always following the get-use-close pattern for connection handling. But the problem still persists.
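For what it's worth, the get-use-close invariant ("never a second getConnection() before close()") can be verified in a plain JVM with a java.sql.Connection built from a dynamic proxy. Everything below (ConnectionTracker and its counters) is our own test scaffolding, not WebSphere code:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;

public class ConnectionTracker {
    static int open = 0;  // connections currently held
    static int peak = 0;  // worst case observed

    // Stand-in for DataSource.getConnection(): hands out a Connection proxy
    // whose close() decrements the counter.
    static Connection getConnection() {
        open++;
        peak = Math.max(peak, open);
        InvocationHandler handler = new InvocationHandler() {
            private boolean closed = false;
            public Object invoke(Object proxy, Method m, Object[] args) {
                if (m.getName().equals("close")) {
                    if (!closed) { closed = true; open--; }
                    return null;
                }
                if (m.getName().equals("isClosed")) return Boolean.valueOf(closed);
                // Other JDBC calls are irrelevant to this demo.
                throw new UnsupportedOperationException(m.getName());
            }
        };
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class[] { Connection.class }, handler);
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 3; i++) {
            Connection con = getConnection();   // get
            try {
                // use: execute statements here
            } finally {
                con.close();                    // close before the next get
            }
        }
        System.out.println("peak connections held = " + peak); // prints 1
    }
}
```

Wrapping the real DataSource the same way in a test harness is one way to prove the pattern holds on every code path, including exception paths, before redeploying.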
We also tried configuring LTCs through the local transaction extended deployment descriptors. We tried the LTC settings for EJBs and servlets as suggested in the WebSphere Information Center, as follows:
a. set Local Transactions - Resolution-control to ContainerAtBoundary.
This did not solve the problem. Instead, the application started throwing the following exception:
Method cleanup failed while trying to execute method cleanup on ManagedConnection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@592c2577 from resource jdbc/wcmds. Caught exception: com.ibm.ws.exception.WsException: DSRA0080E: An exception was received by the Data Store Adapter. See original exception message: Cannot call 'cleanup' on a ManagedConnection while it is still in a transaction..
at com.ibm.ws.rsadapter.exceptions.DataStoreAdapterException.<init>(DataStoreAdapterException.java:217)
at com.ibm.ws.rsadapter.exceptions.DataStoreAdapterException.<init>(DataStoreAdapterException.java:171)
at com.ibm.ws.rsadapter.AdapterUtil.createDataStoreAdapterException(AdapterUtil.java:209)
at com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl.cleanupTransactions(WSRdbManagedConnectionImpl.java:2613)
at com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl.cleanup(WSRdbManagedConnectionImpl.java:2320)
at com.ibm.ejs.j2c.MCWrapper.cleanup(MCWrapper.java:1169)
at com.ibm.ejs.j2c.poolmanager.FreePool.cleanupAndDestroyMCWrapper(FreePool.java:490)
at com.ibm.ejs.j2c.poolmanager.FreePool.returnToFreePool(FreePool.java:315)
at com.ibm.ejs.j2c.poolmanager.PoolManager.release(PoolManager.java:1275)
at com.ibm.ejs.j2c.MCWrapper.releaseToPoolManager(MCWrapper.java:1678)
at com.ibm.ejs.j2c.LocalTransactionWrapper.afterCompletionCode(LocalTransactionWrapper.java:1091)
at com.ibm.ejs.j2c.LocalTransactionWrapper.afterCompletion(LocalTransactionWrapper.java:1025)
at com.ibm.ws.LocalTransaction.LocalTranCoordImpl.informSynchronizations(LocalTranCoordImpl.java(Compiled Code))
at com.ibm.ws.LocalTransaction.LocalTranCoordImpl.cleanup(LocalTranCoordImpl.java:1038)