Recently we started getting a strange problem when restarting Tomcat.
After running "service tomcat restart" (or a separate stop and start, it doesn't matter), the server fails to start and the following is written to catalina.out:
No other software is running on port 8080 on the server. I even check that there are no active connections before the restart, and I still get the same result.
If I do a second restart of Tomcat after several minutes, it goes through fine and quickly, but if I try it after an hour, it fails almost every time.
Make sure you check the right interface on a multi-homed machine. Also, there may be background processes automatically starting Tomcat servers. Finally, it is very easy for developers to make a mistake that keeps Tomcat from terminating properly. So, after stopping Tomcat, check that it is dead before starting it again. Don't do "restart"; do "stop, check, start".
Before the restart I stop the monitoring software that watches the Tomcat process (MONIT), so Tomcat cannot be started automatically.
I stop Tomcat, check for the process with "ps -ef | grep java", and check port 8080 with "netstat -t -a -n -v | grep :8080".
Everything is clean, so I start it, and I get the exception in the logs.
I must admit that the stop command often takes a long time to complete, if that gives you a clue.
I am using a standard Tomcat service script for starting and stopping.
Tomcat shuts down in an orderly way. If one or more of the installed webapps isn't well-behaved, you'll have problems. One of the most common of these is a webapp that runs a scheduler service such as Quartz and doesn't shut down the Quartz scheduler threads (in the servlet destroy() method). That will hang the Tomcat shutdown process forever, or until you forcibly terminate the JVM, since as long as even one non-daemon thread is alive, the JVM cannot terminate.
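To illustrate the point about scheduler threads, here is a minimal sketch using only the JDK. It uses a ScheduledExecutorService as a stand-in for Quartz, and the class name BackgroundJobs and its methods are hypothetical. In a real webapp, the shutdown call would go wherever the app is told it is being stopped, such as the servlet destroy() method mentioned above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical holder for a webapp's background work.
// A ScheduledExecutorService stands in for a Quartz scheduler here.
final class BackgroundJobs {
    private ScheduledExecutorService scheduler;

    // Call when the webapp starts.
    synchronized void start() {
        // The worker thread is non-daemon by default, so it keeps the JVM
        // alive until the executor is shut down.
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> { /* periodic work goes here */ },
                0, 1, TimeUnit.SECONDS);
    }

    // Call when the webapp stops. Returns true if all worker threads ended
    // within the timeout (applied once per attempt).
    synchronized boolean shutdown(long timeoutMillis) throws InterruptedException {
        if (scheduler == null) {
            return true; // never started, nothing to stop
        }
        scheduler.shutdown(); // no new runs; periodic tasks are cancelled
        if (scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return true;
        }
        scheduler.shutdownNow(); // interrupt a task that is still running
        return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

If shutdown() is never called, the worker thread stays alive after the app is stopped, which is the kind of leftover thread that keeps the JVM, and its listening ports, around.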
Ordinarily the Tomcat shutdown process happens fairly quickly. The "Tomcat restart" process consists of a catalina stop, immediately followed by catalina start. Both of these operations are asynchronous - they don't wait for any sort of event to be posted before returning. So if shutdown takes too long, the catalina start may fail because the catalina stop is still running and hasn't yet released the server network ports.
The cure for that is to either make the webapp shut down faster, or to introduce a delay between stop and start.
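Instead of a fixed delay, the script could poll until the port can actually be bound again. The following is only a sketch under assumptions: the class name PortWaiter, the 60-second timeout, and the 500 ms poll interval are made up for illustration, and port 8080 is taken from the question.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper: waits until a TCP port can be bound again.
final class PortWaiter {

    // True if a listening socket can be bound to the port on all interfaces right now.
    static boolean isPortFree(int port) {
        try (ServerSocket probe = new ServerSocket()) {
            probe.setReuseAddress(true); // do not let old TIME_WAIT connections block the probe
            probe.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            return false; // something still holds the port (or binding is not permitted)
        }
    }

    // Polls until the port is free or the timeout expires.
    static boolean waitUntilFree(int port, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isPortFree(port)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        boolean free = waitUntilFree(port, 60_000, 500);
        System.out.println("port " + port + (free ? " is free" : " is still in use"));
        System.exit(free ? 0 : 1);
    }
}
```

With Java 11 or later this can be run between the stop and the start as "java PortWaiter.java 8080", and the start can be skipped or retried when the exit code is non-zero.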
Maybe it is something connected directly with the Linux OS? Sockets left in the TIME_WAIT state, or some other tricky thing?
I doubt it. I've worked with Tomcat since about 2.0. When the apps are all shut down, the rest of the server quickly follows. You might try the JMX console and see if there's something still happening. Or put the whole thing under a debugger and see what threads hang around the longest.
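As a rough sketch of the "which threads hang around" check, the snippet below lists the live non-daemon threads of the JVM it runs in. The class name ThreadReport is hypothetical, and the code would have to be called from inside the Tomcat JVM (for example from a small diagnostic webapp). For an already running server, a thread dump tool such as jstack gives similar information from outside.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper: names of threads that keep the JVM alive.
final class ThreadReport {

    // Returns the names of all live non-daemon threads in the current JVM.
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }
}
```

Any thread that still shows up here after the webapps have been stopped, other than Tomcat's own, is a candidate for the one blocking the shutdown.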
An easy way to tell if it's Tomcat or something in an app is to remove all the webapps and see if it still holds the port too long after shutdown.