Creating threads in Tomcat

James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27

I was recently asked to modify a web application that initiates a very long running task that generally results in correspondingly long user wait time for a page to be displayed. The task itself consists of smaller, independent, individual tasks that, ideally, would be spawned as separate threads and aggregated in a parent task. However, I know creating threads in the servlet container is not recommended.

I have done quite a lot of research on the web and, frankly, while I have uncovered some possible solutions, there seems to be no consensus on the best way to implement a solution for such a scenario. It seems that I am not the only one who has faced such a problem. JSRs 236 and 237 called for a solution but appear to have gone nowhere. JSR 342 (Java EE 7) may resuscitate JSR 236 but I cannot wait. I need a Java EE 5/6 solution.

The CommonJ Work Manager implementation may be a possible solution but is this the way to go? Has anyone else used the CommonJ Work Manager API? If so, can you recommend it?

The Quartz Scheduler seems to be highly recommended but this is NOT a task that may be scheduled. It is an on-demand task and may be requested perhaps 1 to 3 times per day or perhaps not at all on a particular day.

Ideally, I would like to use the java.util.concurrent API introduced in Java SE 5. Has anyone else used this API in Tomcat? If so, can you recommend it? Assuming Java EE 7 does, in fact, eventually implement JSR 236, it seems like this may be the best approach. When the Java EE 7 API becomes available, the code could simply be modified to use the corresponding managed Java EE API classes.

FYI, I am NOT concerned about exceptionally long running tasks or renegade threads. While, as a developer, I would never say never, those scenarios will NOT occur in this particular application. Although RELATIVELY long running, each individual task would generally never exceed 1 second in elapsed time. What makes the parent task long running is that there may be thousands of the individual tasks. If not for the prohibition against creating threads in the servlet container, I would simply create a class that implements Runnable, execute a limited number of threads, perhaps 10, at any given time and aggregate the results in a parent class as each individual task completes.
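As a rough illustration of the "limited number of threads, aggregate in a parent" idea described above, here is a minimal sketch using java.util.concurrent. The class name ParentTask, the method runAll, and the squaring "work" are all made up for the example, and it assumes Java 8 or later for lambdas (on Java SE 5/6 the tasks would need to be anonymous Callable classes). It is not meant as the recommended way to host the pool inside Tomcat, only as a picture of the fan-out and aggregation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical parent task: runs many short sub-tasks on a bounded pool
// and aggregates their results. The squaring "work" is only a placeholder.
class ParentTask {
    static long runAll(int taskCount, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i <= taskCount; i++) {
                final long n = i;
                tasks.add(() -> n * n); // stand-in for one sub-second task
            }
            long total = 0;
            // invokeAll blocks until every sub-task has completed
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a sub-task failed", e.getCause());
        } finally {
            pool.shutdown(); // always release the pool's threads
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(1000, 10)); // prints 333833500
    }
}
```

No more than maxThreads sub-tasks run at once; the remainder wait in the pool's internal queue.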

Assuming I am able to implement a threading solution of some sort, what I would like to do is return the page immediately after submitting a job thread to improve user response time then use Ajax calls to periodically check the job status on the server and update the page as results become available.
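The "return immediately, then poll" flow described above could look roughly like the sketch below. JobTracker, submit, progress and shutdownAndWait are hypothetical names, the per-task work is a placeholder that only bumps a counter, and Java 8+ is assumed. A real Ajax handler would call something like progress() and render the result.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical job tracker: submit() returns at once with a job id,
// and progress() is what a polling Ajax handler would call.
class JobTracker {
    private final ExecutorService pool = Executors.newFixedThreadPool(10);
    private final Map<Integer, AtomicInteger> done = new ConcurrentHashMap<>();
    private final Map<Integer, Integer> totals = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Queue all tasks of a job and return without waiting for them.
    int submit(int taskCount) {
        int id = nextId.incrementAndGet();
        AtomicInteger counter = new AtomicInteger();
        totals.put(id, taskCount);
        done.put(id, counter);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(counter::incrementAndGet); // placeholder for real work
        }
        return id;
    }

    // Reports "completed/total" for a known job id.
    String progress(int id) {
        AtomicInteger counter = done.get(id);
        if (counter == null) {
            return "unknown job";
        }
        return counter.get() + "/" + totals.get(id);
    }

    // Stop accepting work and wait for queued tasks to finish.
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```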

I have actually seen applications that have created their own threads in Tomcat and not experienced any problems (as long as the thread execution time is short and the thread does not fail). However, I have always been leery of such implementations because I know it is not recommended. I would like to implement a solution for this scenario using the recommended best practices. The only problem is that I am having trouble ascertaining what those best practices are.

I would appreciate any recommendations on how to approach a solution to this problem.




James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27
I must not be the only one interested in the java.util.concurrent functionality in Apache Tomcat. For anyone else interested in similar functionality, I found the following classes in Apache Tomcat 7.

ThreadPoolExecutor.java
TaskQueue.java
TaskThread.java
TaskThreadFactory.java
Constants.java

These classes are based on the java.util.concurrent package and are used internally by the Apache Tomcat team, for example in StandardThreadExecutor.java. If it works for the Apache Tomcat team, it should work for me.

I backported ThreadPoolExecutor.java and the 4 associated utility classes to my application, removing only the logging lines from ThreadPoolExecutor.java to eliminate the logging dependency. The Apache Tomcat ThreadPoolExecutor implementation of the ExecutorService interface appears to be exactly what I was looking for.



Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16145
    

The reasons that Http Request/Response server threads may not spawn threads are twofold:

1. If you return the parent thread to the pool while one or more child threads exist, the expected symmetry of resources in the pool is destroyed. Also, if the new user of the thread does something unexpected, it can terminate child threads that were completely unrelated to the current request.

2. If you DON'T return the parent thread to the pool, you starve the pool by tying up threads.

The most common way to manage on-demand long-running processes is to construct an engine in the ServletContext listener that runs when the web application launches. This engine can contain as many threads as it likes, depending on how many of these long-running processes you want to run in parallel. In fact, if you're really into such things, it could construct an entire private thread pool.

Feeding this engine is a synchronized interface that permits users making HTTP requests to queue up and interrogate work requests made to the engine. It has to be synchronized, because the request and the engine are running in separate threads, so you would otherwise have concurrency problems.

You are also at liberty as to how clients get informed on completion. Shorter processes might simply be interrogated by repeated requests (for example, a web page with a timer-fired AJAX query). Longer processes might want to email the client when the work is done and perhaps send a link to the final status display page.
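A minimal sketch of such an engine follows. To keep it self-contained it does not use the servlet API: the assumption is that start() would be called from ServletContextListener.contextInitialized() and stop() from contextDestroyed(), with the engine instance stored as a ServletContext attribute. BackgroundEngine and its method names are invented for illustration, and Java 8+ is assumed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical engine owned by the web application, not by any request thread.
// Assumed wiring: start() from contextInitialized(), stop() from contextDestroyed().
class BackgroundEngine {
    private ExecutorService workers;
    private final Map<String, String> status = new ConcurrentHashMap<>();

    synchronized void start(int threads) {
        workers = Executors.newFixedThreadPool(threads);
    }

    // Called by request threads: queue the work and return immediately.
    synchronized void submit(String jobId, Runnable work) {
        status.put(jobId, "QUEUED");
        workers.execute(() -> {
            status.put(jobId, "RUNNING");
            try {
                work.run();
                status.put(jobId, "DONE");
            } catch (RuntimeException e) {
                status.put(jobId, "FAILED");
            }
        });
    }

    // Called by request threads, e.g. from a timer-fired Ajax query.
    String statusOf(String jobId) {
        return status.getOrDefault(jobId, "UNKNOWN");
    }

    // Stop accepting work, give running jobs a grace period, then interrupt.
    synchronized boolean stop(long graceSeconds) {
        workers.shutdown();
        try {
            if (!workers.awaitTermination(graceSeconds, TimeUnit.SECONDS)) {
                workers.shutdownNow();
                return false;
            }
            return true;
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The request thread only touches the thread-safe submit() and statusOf() methods, so it can go back to the pool right away while the engine's own threads do the work.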


James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27
Tim,

The most common way to manage on-demand long-running processes is to construct an engine in the ServletContext listener that runs when the web application launches. This engine can contain as many threads as it likes, depending on how many of these long-running processes you want to run in parallel. In fact, if you're really into such things, it could construct an entire private thread pool.


I have seen examples of creating an ExecutorService in the context listener. However, is that really the best solution for my scenario? Perhaps it will not consume many resources but I do not see the point of keeping an ExecutorService object in memory for an on-demand task. The task is rarely invoked more than once a day. If an ExecutorService can be created in the servlet context, why not just create it as needed? Or am I wrong?

Also, if one were to create an ExecutorService in the servlet context, I would assume one would need to create as many ExecutorService objects as there are potential multi-threaded tasks in the application? If I am successful, there are other long-running processes in this application that we may wish to address also. Or can a single ExecutorService object handle tasks from potentially multiple batch jobs without the jobs interfering with one another? If so, I see why one would create a single ExecutorService object in the servlet context.

And, if so, perhaps it makes sense (in some environments) to maintain two (2) ExecutorService objects in the servlet context - 1 for tasks that must run in a specific order, i.e. a single-thread service, and 1 for tasks that may run independently, i.e. a multi-threaded service?

Finally, assuming an ExecutorService object is created in the servlet context, would it not make sense to use the Apache Tomcat implementations of ExecutorService, i.e. ThreadPoolExecutor, and ThreadFactory, i.e. TaskThreadFactory?

Thanks for your response!
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16145
    

The ideal HttpRequest processor function runs as quickly as possible and doesn't leave any detritus behind so that maximum throughput is attained and processes won't randomly die owing to lint left in the thread from previous users. That is why spawning threads in processor functions is discouraged.

The overhead for running an independent engine anchored off the ContextListener is really not that great, assuming you design a minimal engine and when its thread(s) are waiting for work, they consume no more CPU resources than any other idle thread (such as the processor threads). How much memory they consume is dependent on what resources you hold while the process is idle. A basic engine with just a thread and a synchronized List object as its work queue would probably be no more than a few hundred bytes on average, not counting the business logic, which you've got to have regardless of who runs it.

You also gain a few other advantages this way. First, for most purposes, the ContextListener-based engine subsystem is practically an independent app. It can be as simple or as complex as you need, it can control one thread or dozens (including load-balancing, if you like), and its threads can safely die without compromising basic web request processing. There's less risk of it becoming an albatross, since it's not some kinky non-standard logic - and because it's not violating standards, it is less subject to being turned into an outright menace by some clueless maintenance programmer.

In one case, I once had to go one step further than even all that. I was responsible for a system that would process a request for literally days at a time. I had to move that one completely out of the webserver, since otherwise any problems that required restarting the webserver would be held hostage by the batch processor (and vice versa). I put that one in a separate stand-alone Java application and used RMI for web-to-engine communications and control.
James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27
Tim,

Okay, I think I am beginning to understand the advantages of using a ServletContext listener implementation. I agree, the overhead is probably minimal so I am not overly concerned about that. I do have one question for you for my clarification. If an ExecutorService object is anchored to the ServletContext listener, do the threads it spawns run inside the Servlet Engine or in a separate process? I am asking to improve my understanding of how this implementation will work.

Now that I have had some time to consider a ServletContext listener implementation, I have tentatively decided there should probably be three (3) ExecutorService objects maintained in the servlet context (in a very generic implementation), i.e.

1) A single-thread executor - when jobs MUST be run in a specified order.
2) A multi-thread job executor - e.g. with a thread pool size of 3, that limits the number of batch jobs that may be run at one time - all batch jobs would be submitted to this executor.
3) A multi-thread job task executor - e.g. with a thread pool size of 10, that limits the number of (batch job) tasks that may be run at one time - this executor would receive tasks from the single or multi-thread job executors, i.e. 1) or 2) above.

Such an environment would seem to me to allow the greatest flexibility for submitting and running batch jobs. Since I am proceeding with a ServletContext listener implementation, what I would like to do is design an implementation that will handle most scenarios.
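For illustration, the job executor and job task executor from 2) and 3) above might be wired together roughly as follows. This is only a sketch under assumptions: TwoLevelExecutors, submitJob and runJobs are invented names, each task is a placeholder that returns 1, and Java 8+ is assumed. The single-thread executor from 1) is left out; Executors.newSingleThreadExecutor() would provide it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical two-level layout: at most 3 batch jobs at once, and at most
// 10 job tasks at once across all running jobs.
class TwoLevelExecutors {
    private final ExecutorService jobExecutor = Executors.newFixedThreadPool(3);
    private final ExecutorService taskExecutor = Executors.newFixedThreadPool(10);

    // A job runs on jobExecutor, fans its tasks out to taskExecutor and waits.
    // The pools are separate, so a waiting job never holds a thread that its
    // own tasks need.
    Future<Integer> submitJob(int taskCount) {
        Callable<Integer> job = () -> {
            List<Future<Integer>> parts = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                Callable<Integer> task = () -> 1; // placeholder task
                parts.add(taskExecutor.submit(task));
            }
            int completed = 0;
            for (Future<Integer> part : parts) {
                completed += part.get();
            }
            return completed;
        };
        return jobExecutor.submit(job);
    }

    // Submit several jobs at once and return the total number of tasks completed.
    int runJobs(int jobCount, int tasksPerJob) {
        List<Future<Integer>> jobs = new ArrayList<>();
        for (int i = 0; i < jobCount; i++) {
            jobs.add(submitJob(tasksPerJob));
        }
        int total = 0;
        try {
            for (Future<Integer> job : jobs) {
                total += job.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        }
        return total;
    }

    void shutdown() {
        jobExecutor.shutdown();
        taskExecutor.shutdown();
    }
}
```

With five jobs submitted, three run concurrently and two wait in the job executor's queue. Had jobs and tasks shared one bounded pool, jobs blocked on their own tasks could exhaust it and deadlock.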

Would you consider this a good design or should I pursue another approach?
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16145
    

Listener-spawned threads are technically part of the "Servlet Engine" in that they share the Servlet Context. But they are not children of servlets, nor are they children of any of the threads in the servlet/jsp processing thread pool. Their parent is the engine as a whole. Which exempts them from the restrictions on threading that servlets themselves have.

You do need to put in a shutdown listener that stops these threads, since the JVM (and therefore Tomcat) cannot shut down until all threads are terminated. Tomcat isn't going to terminate them. Usually, anyway. Tomcat 6 may get annoyed waiting for shutdown, forcibly purge the threads and warn you, but that's not the way to program.

I can't figure out what your third engine architecture is good for. My 2 most likely implementations are:

1. Single-thread only. Thread sleeps waiting for queue posting, pulls head request from queue, runs request, posts status, looks for another entry in queue, repeats until all requests serviced, then goes back to sleep. Often I'll have 3 queues: input (scheduled), working, and finished and move the request from queue to queue at the various stages in processing. Working "queue" is a single element.

2. Multi-thread. Master Dispatcher thread creates thread pool (or spawns threads on-demand). Incoming requests wake up the dispatcher, which pulls a thread from the pool (or spawns one) up to a pre-set limit, then goes back to sleep, having handed the request to the worker thread. Worker thread runs request, posts itself complete, which wakes up dispatcher, who returns worker thread to the pool, reassigns the worker thread to a pending request (in cases where there are more requests than workers), or destroys thread. Your choice.

You can do many refinements on these, such as different request priority levels, optional versus must-run-immediately, load balancing based on resource requests, and so forth. But that's the basics.
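The single-thread variant (1.) might be sketched like this. SingleThreadEngine and its methods are hypothetical, the "processing" is a placeholder comment, awaitFinished() exists only so the example can be exercised, and Java 8+ is assumed. A java.util.concurrent BlockingQueue stands in for the hand-synchronized queue.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical single-thread engine: one worker sleeps on the input queue,
// runs each request in arrival order, and records it as finished.
class SingleThreadEngine {
    private final BlockingQueue<String> input = new LinkedBlockingQueue<>();
    private final List<String> finished = new CopyOnWriteArrayList<>();
    private volatile String working; // the one-element "working queue"
    private final Thread worker = new Thread(this::loop, "engine-worker");

    void start() { worker.start(); }

    // Called by request threads to queue work.
    void post(String request) { input.add(request); }

    String working() { return working; }

    List<String> finished() { return finished; }

    private void loop() {
        try {
            while (true) {
                String request = input.take(); // sleeps until work arrives
                working = request;
                // ... real processing of the request would happen here ...
                finished.add(request);
                working = null;
            }
        } catch (InterruptedException e) {
            // interrupted by stop(): fall through and let the thread end
        }
    }

    // Test convenience: poll until 'count' requests have finished.
    boolean awaitFinished(int count, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (finished.size() < count) {
            if (System.currentTimeMillis() > deadline) {
                return false;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    // Would be called from the shutdown listener.
    void stop() {
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```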
James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27
...Their parent is the engine as a whole. Which exempts them from the restrictions on threading that servlets themselves have.


Tim,

Thanks so much for the explanation! I feel like I am really starting to get a handle on this. FYI, I do invoke shutdownNow() on all executors in the listener's contextDestroyed() method.

FYI, my idea behind the third executor service was simply to limit the maximum number of threads that could run at one time, in my example 14 (1 + 3 + 10). From your response, it is not unlike your (2.) "Multi-thread" example description. My second executor would be the "Master Dispatcher" that spawns child (task) threads in the third executor. My intention was to enable up to 3 batch jobs to execute concurrently, with additional batch jobs getting queued. Those 3 jobs, in turn, could spawn up to 10 individual worker threads, as required, to perform individual tasks.

Does that clarify my intentions?
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16145
    

A word of caution: The Thread.stop() method has been deprecated a long, long time. The recommended way for a long-running process to deal with external shutdowns is for it to query a status indicator that gets set when shutdown is initiated. In the case of a ServletContextListener, that would be in the shutdown listener. So, for example, a long-running file copy might check the engine status every 5000 records or so to see if the machine had been switched to a shutdown state, and if so, would stop copying and clean itself up in an orderly manner.
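The status-indicator approach described above can be sketched as follows. RecordCopier, requestShutdown and copy are invented names and the "copy" is a placeholder loop. The sketch also checks the thread's interrupt flag, on the assumption that the executor's shutdownNow() interrupts its worker threads (which is what ThreadPoolExecutor does).

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical long-running copy that polls a shutdown flag every 5000 records.
class RecordCopier {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    // Would be called from the shutdown listener, i.e. contextDestroyed().
    void requestShutdown() {
        shuttingDown.set(true);
    }

    // Copies up to totalRecords and returns how many were actually copied.
    int copy(int totalRecords) {
        int copied = 0;
        while (copied < totalRecords) {
            boolean checkpoint = copied % 5000 == 0;
            if (checkpoint
                    && (shuttingDown.get() || Thread.currentThread().isInterrupted())) {
                break; // stop early; orderly clean-up would go here
            }
            copied++; // placeholder for copying one record
        }
        return copied;
    }
}
```

Checking only at checkpoints keeps the overhead negligible while still bounding how long shutdown has to wait.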

For your case #3, there's no inherent restrictions on whether your "job" worker threads spawn child threads as long as they tidy up before they're done. The reason servlet processors cannot do likewise is that the servlet requests are expected to consume a minimal amount of time. The engine workers aren't.
James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27
Thanks Tim!
James Davison
Greenhorn

Joined: Mar 28, 2004
Posts: 27
One more question: If I create a singleton around the ServletContext, is that the best way to access the ExecutorService objects in the application?
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16145
    

I'm not sure what you mean by "around the ServletContext".

In fact, I think you're probably better off creating a new thread and asking there.
 