I am quite new to using concurrency in Java, so please forgive me if this is a trivial question.
I would like to make use of something like pool=Executors.newFixedThreadPool(n) to automatically use a fixed number of threads to process pieces of work. I understand that I can asynchronously run some Runnable by one of the threads in the threadpool using pool.execute(someRunnable).
My problem is this: I have a fixed number N of data structures, myDS (which are neither reentrant nor shareable), that get initialized at program start and are needed by the runnables to do the work. So what I would really like is to reuse not only N threads but also N of these data structures to do the work.
So, let's say I want to have 10 threads; then I would want to create 10 myDS objects once and for all. Each time some work comes in, I want that work to be processed by the next free thread, using the next free data structure. What I was wondering is whether there is something in the library that lets me reuse threads AND data structures as simply as just reusing a pool of threads. Ideally, each thread would be associated with one data structure somehow.
Currently I use an approach where I create 10 Runnable worker objects, each with its own copy of myDS. Those worker objects get stored in an ArrayBlockingQueue of size 10. Each time some work comes in, I get the next worker from the queue, pass it the piece of work, and submit it to the thread pool.
The tricky part is how to get the worker object back into the queue: currently I essentially do queue.put(this) at the very end of each Runnable's run method but I am not sure if that is safe or how to do it safely.
What are the standard patterns and library classes to use for solving this problem correctly?
I don't understand why you want to share the data structures. Normally each worker would have its own data structure available as an attribute. What is the reason that you don't want this?
Joined: Mar 28, 2008
Sorry for not explaining this clearly ... I do not want to share the data structures, I want to reuse them. All the examples of using a thread pool create new workers for new work that arrives, but in my case, I need to use those data structures to do the work. So let's say I have 10 such data structures; I can then have 10 threads, each using one of them. The problem is how to make sure that once a thread has finished, the data structure it used can be used by the next new thread.
So, the thread-pooling library methods manage the threads for me, but I did not find a way to manage the data structures at the same time. Ideally, I would just associate each thread with exactly one data structure, but I simply could not figure out how that can be done with Executors.newFixedThreadPool(n).
So I ended up managing the data structures myself using the ArrayBlockingQueue while the threads are managed by the fixed thread pool. That seems to work but is somewhat unsatisfactory to me: if I need to do thread-safe management of a pool of data structures myself, I could just as well manage the threads myself that way.
On the other hand, I have the feeling that associating each thread with its own copy of some data structure must be needed so often that maybe the library already supports it and I am just missing it.
How about setting the appropriate data structure on the runnable when you pass it to execute()?
Jim Akmer wrote: How about setting the appropriate data structure on the runnable when you pass it to execute()?
The problem with setting my data structure when a new runnable is created is almost identical to what I described in my previous post: since I have to set one of N fixed data structures, before I can set it, I must be sure that the thread that used it before has safely released it. So instead of managing a fixed set of runnables, I have to manage a fixed set of data structures with the same multithreading issues: I have to make sure that each runnable, when it completes successfully or in error, hands back the data structure to some pool without interfering with some other thread handing back another data structure. So, the basic problem remains the same: I need to do some thread-safe pool management myself and would really prefer a library function, similar to the one already managing my thread pool, to do that.
Are there any third-party (open-source) libraries that would do what I want? I simply cannot believe that the wheel actually has to be reinvented for this...
I have to make sure that each runnable, when it completes successfully or in error, hands back the data structure to some pool without interfering with some other thread handing back another data structure.
Have you considered placing your data structures in a queue? Once a thread is ready to run, it pops a data structure, which removes it from the queue. So there is no interference with a new thread: if the old thread has not finished, its data structure is still missing from the queue. Once a thread finishes, it pushes the data structure back onto the queue, and the next thread can safely work with it.
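As a rough sketch of this idea, assuming a java.util.concurrent.BlockingQueue serves as the queue (take() as the "pop", add() as the "push"): MyDS here is only a hypothetical stand-in for the real data structure, and DataStructurePoolDemo and runAll are made-up names for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the non-shareable data structure from the question.
class MyDS {
    private int uses; // unsynchronized on purpose: only one thread holds a MyDS at a time
    void process(String work) { uses++; }
    int uses() { return uses; }
}

class DataStructurePoolDemo {
    // Runs `tasks` pieces of work on `n` threads that borrow from `n` pooled MyDS objects.
    // Returns the total number of process() calls recorded across the pool.
    static int runAll(int n, int tasks) {
        BlockingQueue<MyDS> free = new ArrayBlockingQueue<>(n);
        for (int i = 0; i < n; i++) {
            free.add(new MyDS());
        }
        ExecutorService pool = Executors.newFixedThreadPool(n);
        for (int t = 0; t < tasks; t++) {
            final String work = "work-" + t;
            pool.execute(() -> {
                MyDS ds;
                try {
                    ds = free.take(); // blocks until a data structure is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // nothing was borrowed, so nothing to hand back
                }
                try {
                    ds.process(work);
                } finally {
                    // Runs on success and on error; add() cannot overflow because
                    // the queue capacity equals the number of MyDS objects.
                    free.add(ds);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (MyDS ds : free) {
            total += ds.uses();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runAll(3, 20));
    }
}
```

The queue's own locking is what makes the hand-off safe: a data structure is only visible to one thread between take() and add().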
Johann Petrak wrote:...currently I essentially do queue.put(this) at the very end of each Runnable's run method but I am not sure if that is safe or how to do it safely.
This is the correct thing to do. What you should do is wrap all of the work in a try block and put the queue.put() in a finally block. I usually do this sort of thing in a 'wrapper' type of situation, which automates it so I don't forget to type it in:
Then you need a 'DataUser' interface with setData() and getData() methods, so we can inject the data as needed. You could then run the thing like this:
So the DataUser is the interface you use per task. You implement getData/setData as needed, and the run() method as you would for a normal Runnable. You can also build this into an easier to use (though maybe harder to understand) construct. For example, you could (as you suggested) combine the data injection and the thread pool into a single unit. Below is the code:
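A minimal sketch of what such a wrapper might look like, assuming a BlockingQueue holds the free data objects. The DataUser shape, the field names, and the use of add() instead of put() in the finally block are assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;

// Assumed shape of the per-task interface: a Runnable that can have data injected.
interface DataUser<T> extends Runnable {
    void setData(T data);
    T getData();
}

// Wrapper that borrows a data object, injects it, runs the task, and always hands it back.
class DataGrabbingTask<T> implements Runnable {
    private final DataUser<T> task;
    private final BlockingQueue<T> free;

    DataGrabbingTask(DataUser<T> task, BlockingQueue<T> free) {
        this.task = task;
        this.free = free;
    }

    @Override
    public void run() {
        T data;
        try {
            data = free.take(); // wait for a free data object
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return; // nothing borrowed, nothing to return
        }
        try {
            task.setData(data);
            task.run();
        } finally {
            // add() does not throw InterruptedException, which keeps the finally
            // block simple. It assumes the queue has room for every pooled object.
            free.add(data);
        }
    }
}
```

Because the return happens in the finally block, a task that throws still hands its data object back before the exception propagates.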
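As an illustration only, here is one way the pieces could fit together. AppendTask and WrapperUsageDemo are hypothetical names, StringBuilder stands in for the real data structure, and the wrapper is repeated in compact form so the example compiles on its own.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface DataUser<T> extends Runnable {
    void setData(T data);
    T getData();
}

// Compact form of the wrapper: borrow, inject, run, always return.
class DataGrabbingTask<T> implements Runnable {
    private final DataUser<T> task;
    private final BlockingQueue<T> free;
    DataGrabbingTask(DataUser<T> task, BlockingQueue<T> free) { this.task = task; this.free = free; }
    @Override public void run() {
        T data;
        try { data = free.take(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); return; }
        try { task.setData(data); task.run(); }
        finally { free.add(data); }
    }
}

// Hypothetical task: appends some text to whichever StringBuilder it is handed.
class AppendTask implements DataUser<StringBuilder> {
    private final String text;
    private StringBuilder data;
    AppendTask(String text) { this.text = text; }
    @Override public void setData(StringBuilder data) { this.data = data; }
    @Override public StringBuilder getData() { return data; }
    @Override public void run() { getData().append(text); }
}

class WrapperUsageDemo {
    // Returns the total number of characters appended across all pooled builders.
    static int run(int threads, int tasks) {
        BlockingQueue<StringBuilder> free = new ArrayBlockingQueue<>(threads);
        for (int i = 0; i < threads; i++) {
            free.add(new StringBuilder());
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            // Wrap each task so the data object is injected and returned automatically.
            pool.execute(new DataGrabbingTask<>(new AppendTask("x"), free));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (StringBuilder sb : free) {
            total += sb.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run(3, 20));
    }
}
```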
First, the DataUser interface again...
Then, an interface for the data injecting executor service which adds a new method for tasks that need data injection:
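A sketch of how these two interfaces might be declared. The names DataInjectingExecutorService and executeWithData are hypothetical, chosen only for illustration, and the generic parameter is an assumption.

```java
import java.util.concurrent.ExecutorService;

// Per-task interface: a Runnable that can have its data object injected.
interface DataUser<T> extends Runnable {
    void setData(T data);
    T getData();
}

// An ExecutorService with one extra method for tasks that need a pooled data object.
interface DataInjectingExecutorService<T> extends ExecutorService {
    // Hypothetical method name: schedules the task and injects a free data
    // object into it before its run() method is called.
    void executeWithData(DataUser<T> task);
}
```

Extending ExecutorService means ordinary Runnables can still go through execute() and submit(), while only the data-hungry tasks use the extra method.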
The next chunk of code is an implementation of the above interface. It uses AbstractExecutorService as a backbone, and uses an external ExecutorService as the thread pool (so you can customize behavior by passing in a different ExecutorService, rather than modifying/extending this class). Note that I moved the DataGrabbingTask class into this class. There is no need for external users to generate or even know about it. The data handling will all be done internally.
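A sketch of how such an implementation could look under the description above. PooledDataExecutor, executeWithData, and the demo class are hypothetical names, and both interfaces are repeated so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface DataUser<T> extends Runnable {
    void setData(T data);
    T getData();
}

interface DataInjectingExecutorService<T> extends ExecutorService {
    void executeWithData(DataUser<T> task); // hypothetical method name
}

class PooledDataExecutor<T> extends AbstractExecutorService
        implements DataInjectingExecutorService<T> {
    private final ExecutorService delegate; // external thread pool doing the real work
    private final BlockingQueue<T> free;    // data objects not currently in use

    PooledDataExecutor(ExecutorService delegate, Collection<? extends T> data) {
        this.delegate = delegate;
        this.free = new ArrayBlockingQueue<>(data.size(), false, data);
    }

    @Override public void executeWithData(DataUser<T> task) {
        delegate.execute(new DataGrabbingTask(task));
    }

    // Everything else is plain delegation to the wrapped pool.
    @Override public void execute(Runnable command) { delegate.execute(command); }
    @Override public void shutdown() { delegate.shutdown(); }
    @Override public List<Runnable> shutdownNow() { return delegate.shutdownNow(); }
    @Override public boolean isShutdown() { return delegate.isShutdown(); }
    @Override public boolean isTerminated() { return delegate.isTerminated(); }
    @Override public boolean awaitTermination(long timeout, TimeUnit unit)
            throws InterruptedException {
        return delegate.awaitTermination(timeout, unit);
    }

    // Internal wrapper: callers never create or see it.
    private final class DataGrabbingTask implements Runnable {
        private final DataUser<T> task;
        DataGrabbingTask(DataUser<T> task) { this.task = task; }
        @Override public void run() {
            T data;
            try { data = free.take(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); return; }
            try { task.setData(data); task.run(); }
            finally { free.add(data); } // hand back on success and on error
        }
    }
}

class PooledDataExecutorDemo {
    // Returns the total number of characters appended across all pooled builders.
    static int run(int threads, int tasks) {
        List<StringBuilder> data = new ArrayList<>();
        for (int i = 0; i < threads; i++) data.add(new StringBuilder());
        PooledDataExecutor<StringBuilder> exec =
                new PooledDataExecutor<>(Executors.newFixedThreadPool(threads), data);
        for (int i = 0; i < tasks; i++) {
            exec.executeWithData(new DataUser<StringBuilder>() {
                private StringBuilder sb;
                @Override public void setData(StringBuilder d) { sb = d; }
                @Override public StringBuilder getData() { return sb; }
                @Override public void run() { sb.append('x'); }
            });
        }
        exec.shutdown();
        try {
            if (!exec.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (StringBuilder sb : data) total += sb.length();
        return total;
    }
}
```

One thing to watch with this design: if the wrapped pool has more threads than there are data objects, the surplus threads simply block in take() until an object is handed back.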