I have run into a socket communication problem on a Linux system. The communication process is as follows: the client sends a message asking the server to run a compute task, then waits for the result message that the server sends once the task completes.
However, if the task takes a long time (about 40 minutes, for example), the client, which is blocked in ObjectInputStream.readObject(), hangs waiting for the result message, even though on the server side the result message has already been written to the socket. If the task takes only a short time, such as one minute, the client receives the result message normally. In addition, this problem only happens in the customer's environment; the communication works normally in our testing environment.
Both ObjectInputStream.readObject() and ObjectOutputStream.writeObject() are invoked on the client and on the server.
When the server invokes objectOutputStream.writeObject() to respond to the client, the corresponding objectInputStream.readObject() call on the client never receives the message and hangs forever. This objectOutputStream and objectInputStream are created from the same socket connection.
The client also calls objectOutputStream.writeObject() to write the request to the server, where objectInputStream.readObject() is used to accept the request.
I suspected that the cause was a difference in the default socket timeout values between the customer environment and the testing environment, but the values I checked are identical in the two environments, on both the client and the server.
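For anyone who wants to repeat the comparison, here is a minimal sketch of how per-socket option values could be dumped in each environment. The class name SocketOptionDump and the choice of options are assumptions for illustration, not the values originally compared, and the sketch only covers options visible through java.net.Socket.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper: prints the socket options that java.net.Socket exposes,
// so the output can be diffed between two environments.
class SocketOptionDump {

    static String describe(Socket socket) throws SocketException {
        return "SO_TIMEOUT=" + socket.getSoTimeout()          // 0 means an infinite read timeout
             + " SO_KEEPALIVE=" + socket.getKeepAlive()
             + " TCP_NODELAY=" + socket.getTcpNoDelay()
             + " SO_LINGER=" + socket.getSoLinger()           // -1 means the option is disabled
             + " SO_RCVBUF=" + socket.getReceiveBufferSize()
             + " SO_SNDBUF=" + socket.getSendBufferSize();
    }

    public static void main(String[] args) throws Exception {
        // An unconnected socket is enough to read the default option values.
        try (Socket socket = new Socket()) {
            System.out.println(describe(socket));
        }
    }
}
```

Calling describe() on the real client and server sockets, rather than on a fresh one, would show the values actually in effect for the connection.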
To solve this problem, I added calls to the flush and reset methods, but the problem still exists.
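To make the exchange concrete, here is a minimal sketch of the request/response pattern described above, run over a loopback connection. It is not the actual application code: the class and method names, the String payloads, and the placement of reset() before each write and flush() after it are all assumptions.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of the client/server exchange; names and payloads are made up.
class ExchangeSketch {

    // Server side: accept one connection, read one request, write one result.
    static void serveOnce(ServerSocket listener) throws IOException, ClassNotFoundException {
        try (Socket socket = listener.accept();
             ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream())) {
            out.flush(); // send the stream header so the peer's ObjectInputStream constructor returns
            ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
            String request = (String) in.readObject();
            String result = "result of " + request; // stands in for the long compute task
            out.reset();             // drop back-references held by the stream
            out.writeObject(result);
            out.flush();             // push the result onto the socket
        }
    }

    // Client side: send one request, then block in readObject() until the result arrives.
    static String requestOnce(int port, String task) throws IOException, ClassNotFoundException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream())) {
            out.flush();
            ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
            out.reset();
            out.writeObject(task);
            out.flush();
            return (String) in.readObject();
        }
    }

    // Runs the server in a background thread and performs one exchange.
    static String runExchange(String task) throws Exception {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> {
                try {
                    serveOnce(listener);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
            server.start();
            String result = requestOnce(listener.getLocalPort(), task);
            server.join();
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runExchange("task-1"));
    }
}
```

On a loopback connection with an instant "task" this completes normally, which matches the short-task case; it does not reproduce the hang seen after a 40-minute task.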
Does anyone know what steps I should take next to solve this problem? I guess the cause is an environment setting, but I do not know which environmental factors would affect socket communication.
The socket communicates over TCP/IP, and the problem is related to the long-running task, so which TCP settings would affect the timeout of socket communication?
After analyzing the logs, I found that no exceptions were thrown or caught after the message was written to the socket. However, always after 15 minutes, an exception appears in the objectInputStream.readObject() code on the server side, which is used to accept requests from the client. socket.getSoTimeout() returns 0, so it is very strange that a timed-out exception was thrown.
So why is the "Connection timed out" exception thrown?
According to the Java API documentation for the setSoTimeout method, if a non-zero value is set, then when the time expires only a SocketTimeoutException is thrown, not "SocketException: Connection timed out". So this exception should not be related to the setSoTimeout method.
[javadoc]public void setSoTimeout(int timeout) throws SocketException
Enable/disable SO_TIMEOUT with the specified timeout, in milliseconds. With this option set to a non-zero timeout, a read() call on the InputStream associated with this Socket will block for only this amount of time. If the timeout expires, a java.net.SocketTimeoutException is raised, though the Socket is still valid. The option must be enabled prior to entering the blocking operation to have effect. The timeout must be > 0. A timeout of zero is interpreted as an infinite timeout.[/javadoc]
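The behaviour quoted from the javadoc can be checked with a small loopback sketch. The class and method names are hypothetical, and the 200 ms timeout is an arbitrary choice. It only shows that an expired SO_TIMEOUT surfaces as SocketTimeoutException and leaves the socket usable; it does not reproduce the "SocketException: Connection timed out" from the logs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical sketch: demonstrates the SO_TIMEOUT behaviour described in the javadoc.
class SoTimeoutSketch {

    // Returns true if a blocked read() ends with SocketTimeoutException
    // and the same socket can still receive data afterwards.
    static boolean readTimesOutButSocketStaysValid(int timeoutMillis) throws IOException {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(InetAddress.getLoopbackAddress(), listener.getLocalPort());
             Socket peer = listener.accept()) {
            client.setSoTimeout(timeoutMillis); // must be set before entering the blocking read
            InputStream in = client.getInputStream();
            try {
                in.read(); // the peer sends nothing yet, so this blocks until the timeout expires
                return false;
            } catch (SocketTimeoutException expected) {
                // The read timer expired; the connection itself is untouched.
            }
            // The socket is still valid: a byte sent now is delivered normally.
            peer.getOutputStream().write(42);
            peer.getOutputStream().flush();
            return !client.isClosed() && in.read() == 42;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTimesOutButSocketStaysValid(200));
    }
}
```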