Hung Thread

Saul Tocsin
Greenhorn

Joined: Feb 25, 2012
Posts: 14
Hi All,

I've spent time browsing this forum, but this is my first post.

I am trying to think through a scenario that sees a long-running thread, started by an Executor, get "permanently" hung. We don't know why this happens. Maybe it's blocked in I/O. Consider it a "given." Supposing that one can detect that the thread is hung, how can one truly cancel it?

I am familiar with the reasons behind the deprecation of Thread.stop(). The replacement approach seems to rely on the target thread (the one you want to stop) being active, i.e., the target thread periodically checks a variable that is shared with another thread and simply exits its run method. But this doesn't work for a hung thread. Moreover, it is simply not the case that a thread is always able to see a Thread.interrupt() presented by another thread.
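For reference, the cooperative approach I'm describing looks roughly like the sketch below. The class name PollingWorker, the helper stopsPromptly, and the 10 ms sleep standing in for real work are all illustrative assumptions, not code from a real application. The loop only gets back to its checks if each unit of work returns, so a call that never returns defeats it.

```java
// Hypothetical worker illustrating the flag-plus-interrupt pattern.
final class PollingWorker implements Runnable {
    // volatile so that a write from the controlling thread is visible here
    private volatile boolean stopRequested = false;

    void requestStop() {
        stopRequested = true;
    }

    @Override
    public void run() {
        // Both checks happen only between units of work.
        while (!stopRequested && !Thread.currentThread().isInterrupted()) {
            doOneUnitOfWork();
        }
    }

    // Placeholder for real work. If this never returned, the loop above
    // would never re-check the flag or the interrupt status.
    private void doOneUnitOfWork() {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            // Restore the interrupt status so the loop condition sees it.
            Thread.currentThread().interrupt();
        }
    }

    // Starts a worker, asks it to stop (by flag or by interrupt), and
    // reports whether the thread exited within two seconds.
    static boolean stopsPromptly(boolean useInterrupt) {
        PollingWorker worker = new PollingWorker();
        Thread thread = new Thread(worker, "polling-worker");
        thread.setDaemon(true);
        thread.start();
        if (useInterrupt) {
            thread.interrupt();
        } else {
            worker.requestStop();
        }
        try {
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped by flag: " + stopsPromptly(false));
        System.out.println("stopped by interrupt: " + stopsPromptly(true));
    }
}
```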

To wax philosophic for a moment, the JVM has in some sense replaced the real OS. That is, it functions as a mediating and insulating layer between our Java programs and the kernel. But the kernel can always cancel one of its tasks. I have the sense that the JVM is deficient in this regard and that there really is no graceful solution to certain types of thread states.

So I am interested in hearing folks' thoughts about ways of handling this problem. I looked into creating the long-running task as a Future in the hope that its cancel() method might be of some help. But even here I sense some uncertainty, e.g., the docs say: "Attempts to cancel execution of this task. This attempt will fail if the task has already completed, has already been cancelled, or could not be cancelled for some other reason." (emphasis mine).
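To make that uncertainty concrete, here is a small sketch of what I believe happens with the standard executors when the task does not cooperate. CancelDemo and its fields are hypothetical names, and the spin loop is only a stand-in for a hung call that ignores interruption. In this setup cancel(true) reports success and the Future is marked cancelled, yet the task body keeps running on the worker thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo: cancelling a task that never checks for interruption.
final class CancelDemo {
    // Returns { cancel(true) result, isCancelled(), task body still running }.
    static boolean[] cancelUnresponsiveTask() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean release = new AtomicBoolean(false);
        AtomicBoolean exited = new AtomicBoolean(false);
        try {
            Future<?> future = executor.submit(() -> {
                started.countDown();
                // Stand-in for a hung call: spins and ignores interrupts.
                while (!release.get()) {
                    // intentionally empty
                }
                exited.set(true);
            });
            started.await();                      // task is now running
            boolean cancelResult = future.cancel(true);
            Thread.sleep(200);                    // give the interrupt time to "work"
            boolean stillRunning = !exited.get();
            return new boolean[] { cancelResult, future.isCancelled(), stillRunning };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        } finally {
            release.set(true);                    // let the worker exit so the JVM can stop
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] r = cancelUnresponsiveTask();
        System.out.println("cancel returned: " + r[0]);
        System.out.println("isCancelled: " + r[1]);
        System.out.println("task still running: " + r[2]);
    }
}
```

So a true return value from cancel() says the Future is cancelled, not that the underlying thread has stopped.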

Thanks.

-Saul

Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18875

Saul Tocsin wrote:
I've spent time browsing this forum, but this is my first post.


Welcome to the JavaRanch !!

Henry

Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18875

Saul Tocsin wrote:
I am trying to think through a scenario that sees a long-running thread, started by an Executor, get "permanently" hung. We don't know why this happens. Maybe it's blocked in I/O. Consider it a "given." Supposing that one can detect that the thread is hung, how can one truly cancel it?

I am familiar with the reasons behind the deprecation of Thread.stop(). The replacement approach seems to rely on the target thread (the one you want to stop) being active, i.e., the target thread periodically checks a variable that is shared with another thread and simply exits its run method. But this doesn't work for a hung thread. Moreover, it is simply not the case that a thread is always able to see a Thread.interrupt() presented by another thread.


There is no easy answer here. In some cases, getting out of this situation will definitely take some work.

As you mentioned...

1. You can periodically check a flag.
2. You can have the thread be interruptible.

but that doesn't solve all cases. Other options are...

3. You can configure your I/O to have timeouts (for example, sockets support a connect timeout).
4. You can have the thread use interruptible I/O, so that an interrupt can break out of even a blocking I/O call.
4a. If the JVM doesn't support I/O interrupts, you can fall back on I/O exceptions: closing a socket or file while it is being used will also generate an exception.

or ...

5. You can support a combination of all of the above.
6. You can design the application to tolerate being hung for a bit, as long as the thread checks its flags before doing anything else when/if it comes back.
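As a rough illustration of options 3 and 4a for sockets, here is a minimal sketch. BlockedReadDemo and its method names are made up for the example, and it uses a loopback connection whose peer never writes as a stand-in for a stalled remote end. The first method relies on a read timeout set with setSoTimeout; the second has another thread close the socket, which makes the blocked read throw a SocketException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;

// Hypothetical demo of two ways to get a thread out of a blocking socket read.
final class BlockedReadDemo {
    // Option 3: a read timeout turns an indefinite block into an exception.
    static String readWithTimeout() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            client.setSoTimeout(200);              // milliseconds
            try {
                client.getInputStream().read();    // peer never writes, so this blocks
                return "read returned";
            } catch (SocketTimeoutException e) {
                return "timed out";
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Option 4a: closing the socket from another thread unblocks the reader.
    static String readUnblockedByClose() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            Thread closer = new Thread(() -> {
                try {
                    Thread.sleep(200);
                    client.close();                // the blocked read() now fails
                } catch (IOException | InterruptedException ignored) {
                    // nothing useful to do in this sketch
                }
            }, "socket-closer");
            closer.setDaemon(true);
            closer.start();
            try {
                client.getInputStream().read();    // blocks until the socket is closed
                return "read returned";
            } catch (SocketException e) {
                return "unblocked by close";
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithTimeout());
        System.out.println(readUnblockedByClose());
    }
}
```

Either way the reading thread has to treat the exception as a signal to check its flags and clean up, which is the combination suggested in options 5 and 6.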

Henry
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18875

Saul Tocsin wrote:
To wax philosophic for a moment, the JVM has in some sense replaced the real OS. That is, it functions as a mediating and insulating layer between our Java programs and the kernel. But the kernel can always cancel one of its tasks. I have the sense that the JVM is deficient in this regard and that there really is no graceful solution to certain types of thread states.

So I am interested in hearing folks' thoughts about ways of handling this problem. I looked into creating the long-running task as a Future in the hope that its cancel() method might be of some help. But even here I sense some uncertainty, e.g., the docs say: "Attempts to cancel execution of this task. This attempt will fail if the task has already completed, has already been cancelled, or could not be cancelled for some other reason." (emphasis mine).


Not sure what you mean by "the kernel can always cancel one of its tasks". All the issues mentioned, like being hung on I/O, also apply at the OS layer, and the same techniques also need to be used. Or do you mean something specific that hasn't been mentioned yet?

Henry
Saul Tocsin
Greenhorn

Joined: Feb 25, 2012
Posts: 14
Hi Henry,

Thank you for the welcome, and for the reply.

The difficulty here is that I don't have much control over the I/O layer. Imagine that the thread in question is in the "business" layer and that it calls into the lower DAO/Persistence layer via some kind of CRUD procedure call connector. So I don't think that my Java thread is in a position to create and benefit from an InterruptibleChannel in presenting such a call to the layer below (please correct me if you think otherwise).

More generally, I meant the "blocked in I/O" merely as an example. The general problem is "thread hung;" don't know why, it's just not responding.

Thanks.

-Saul
Saul Tocsin
Greenhorn

Joined: Feb 25, 2012
Posts: 14
Henry Wong wrote:
Not sure what you mean by "the kernel can always cancel one of its tasks". All the issues mentioned, like being hung on I/O, also apply at the OS layer, and the same techniques also need to be used. Or do you mean something specific that hasn't been mentioned yet?


Hi Henry,

Simply that the kernel can:

a. detect that a task is not dispatchable, i.e., not runnable because it's "blocked" waiting for something
b. impose a policy that says "cancel a task if it is blocked for N minutes"; or notify the operator and let him cancel the task from without
c. remove the canceled task's entry from the dispatch queue
d. reclaim resources associated with the canceled task

I don't mean that the kernel can resolve the initial cause of the hang, e.g., a malfunctioning device.

The JVM architecture seems deficient re (b) and (d). There seems to be no way to cancel such a hung task, etc. Please note that this post is not meant as an "indictment" of the JVM architecture! I only bring it up because I am struck by what seems to me to be a liability. I agree that it's a difficult problem.
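The closest thing I can see to policy (b) inside the JVM is a deadline enforced by the submitting thread, roughly as sketched below. TimeoutPolicy, runWithDeadline, and demo are hypothetical names, and the sleeping task is an assumption chosen because it responds to interruption. For a task that ignores interrupts the same code would give up waiting but could not reclaim the worker thread, which is the gap I mean regarding (d).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical "cancel if not done within N" policy built on Future.
final class TimeoutPolicy {
    // Returns true if the task finished in time, false if it was cancelled.
    static boolean runWithDeadline(ExecutorService executor, Runnable task, long timeoutMillis) {
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true);                   // delivers an interrupt to the worker
            return false;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    // Returns { finished in time, worker thread released after cancellation }.
    static boolean[] demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Runnable sleepy = () -> {
            try {
                Thread.sleep(10_000);              // interruptible stand-in for slow work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit promptly when cancelled
            }
        };
        boolean finished = runWithDeadline(executor, sleepy, 100);
        executor.shutdown();
        boolean released;
        try {
            released = executor.awaitTermination(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            released = false;
        }
        return new boolean[] { finished, released };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("finished in time: " + r[0]);
        System.out.println("worker released: " + r[1]);
    }
}
```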

I am just trying to find a graceful and not too labored solution to it.

Thanks again.

-Saul

Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18875

Saul Tocsin wrote:
The difficulty here is that I don't have much control over the I/O layer. Imagine that the thread in question is in the "business" layer and that it calls into the lower DAO/Persistence layer via some kind of CRUD procedure call connector. So I don't think that my Java thread is in a position to create and benefit from an InterruptibleChannel in presenting such a call to the layer below (please correct me if you think otherwise).

More generally, I meant the "blocked in I/O" merely as an example. The general problem is "thread hung;" don't know why, it's just not responding.



Unfortunately, you are at the mercy of the underlying layer that you are using. This is an issue even for a C program that uses a library making OS system calls directly: if the library hangs, your only option is to hope that it can be configured not to hang.

Henry
 