JavaRanch » Java Forums » Java » Threads and Synchronization

When utilizing all cores - speed of execution decrease significantly. Why?

Adrian Burlington
Ranch Hand

Joined: Jun 16, 2009
Posts: 75
Something doesn't make sense in my code in terms of speed of execution. The computer I'm using is a Power Mac with 16 GB of RAM and 16 cores. The ReadTaskThread simply queries the DB (a SELECT) and returns a list of items; apparently it takes about the same time to fetch 10 items as 1000 items (give or take 2 seconds).

Problem 1: With nProcessors=2 cores I get the best performance (execution in 1.2 minutes). With nProcessors=3 cores or more, execution takes 7+ minutes (the worst).
Problem 2: When the list holds 1000 items, the consumer takes a while to process them, so the producer has time to fetch new data from the DB in the meantime. When the list is small, the delay is huge because the consumer sits waiting for the producer to fetch the data.

Question 1: I was under the impression that the more the processors are utilized, the better. Why is that not the case in the scenario I presented?
Question 2: Is newFixedThreadPool the right pool to use here?

Thank you for any pointers!


Mike Peters
Ranch Hand

Joined: Oct 10, 2009
Posts: 67

When reading from disk, processing power is probably not your bottleneck. To improve performance in this situation, you could add more disks to allow parallel disk access.


Mike Peters
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18896


Also, unless you are using an embedded DB, you may want to look at your database server too. Taking four times longer with one extra request in parallel seems weird to me -- unless, of course, you are doing more work with the extra request.

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4181

I also don't think you are consuming the data very efficiently. You consume the results in the order in which the tasks were submitted, rather than the order in which they complete. Perhaps you should change the way the tasks object is used: for example, make it a BlockingQueue implementation and pass it to both the producer and the consumer. The producer can then put tasks into the queue directly, and the consumer takes tasks in the order they become available, which may reduce the time spent waiting for a specific producer.
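A minimal sketch of that BlockingQueue idea, assuming one producer and one consumer (the class name, the poison-pill sentinel, and the item strings are invented for illustration; the real producer would run the DB SELECT):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineSketch {
    // Sentinel value telling the consumer there is no more work.
    private static final String POISON = "POISON";

    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: a full queue makes the producer block instead of
        // racing ahead of the consumer.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);

        // Producer: stands in for the DB fetch.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    queue.put("row-" + i);   // blocks if the queue is full
                }
                queue.put(POISON);           // signal end of data
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumer: takes items as soon as they become available,
        // regardless of which task produced them.
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String item = queue.take(); // blocks until an item arrives
                    if (item.equals(POISON)) break;
                    System.out.println("processed " + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```

With more than one producer feeding the same queue, the consumer would naturally process results in completion order rather than submission order.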

One last thing... you might consider working on the balance between producers and consumers as well. Starting with your current thread pool, turn your consumer into a Runnable that executes in the pool (taking one thread away from the producers) and measure the results. Then add a second consumer, a third, and so on, and see whether some balance optimizes performance. If the consumer's job is processor-intensive, having several consumers running while each producer waits on the database may perform better. And you don't necessarily have to limit yourself to 16 threads total: if your DB task talks to a remote database and the operation takes some time, then the producers aren't using the processors while they wait on the DB -- the perfect time to let something else use the processor, such as another consumer or another producer.
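One way to experiment with that balance is to submit both producers and consumers to the same pool and treat the two counts as tuning knobs. A rough sketch under invented names and numbers (nothing here is taken from the original code):

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class BalanceSketch {
    public static void main(String[] args) throws Exception {
        int nProducers = 2;   // tuning knob: number of DB-fetching tasks
        int nConsumers = 2;   // tuning knob: number of processing tasks
        ExecutorService pool = Executors.newFixedThreadPool(nProducers + nConsumers);
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger consumed = new AtomicInteger();
        CountDownLatch producersDone = new CountDownLatch(nProducers);

        for (int p = 0; p < nProducers; p++) {
            pool.submit(() -> {
                try {
                    for (int i = 0; i < 5; i++) {
                        queue.put(i);           // stand-in for a DB fetch
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    producersDone.countDown();
                }
            });
        }
        for (int c = 0; c < nConsumers; c++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        // Give up after a quiet period; real code would use
                        // a poison pill per consumer instead.
                        Integer item = queue.poll(200, TimeUnit.MILLISECONDS);
                        if (item == null) return;
                        consumed.incrementAndGet(); // stand-in for processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        producersDone.await();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("consumed " + consumed.get() + " items");
    }
}
```

Re-running with different nProducers/nConsumers splits, while timing the run, is the measurement Steve describes.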


Steve
Mike Pukuotukas
Greenhorn

Joined: Oct 08, 2010
Posts: 6
New object allocation can still be a bottleneck when many cores are used. With one or two cores it may not be important (it is even discouraged) to care much about object reuse, but with 10 or more cores, "traditionally inappropriate" approaches like object pooling and reuse can help performance. The best approach is to allocate all needed objects outside the critical section and avoid allocating new objects in the main loops. We were able to speed some programs up by several times after fixing these issues.
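To illustrate the idea, here is a small sketch contrasting per-iteration allocation with one buffer allocated before the loop and reset each pass (the StringBuilder is just a stand-in; real code would reuse its own domain objects):

```java
public class ReuseSketch {
    public static void main(String[] args) {
        int n = 100_000;

        // Allocating inside the loop: one new StringBuilder per iteration,
        // which puts pressure on the allocator and the GC under many cores.
        long sum1 = 0;
        for (int i = 0; i < n; i++) {
            StringBuilder sb = new StringBuilder();
            sb.append(i);
            sum1 += sb.length();
        }

        // Reusing one buffer allocated before the loop.
        StringBuilder reusable = new StringBuilder();
        long sum2 = 0;
        for (int i = 0; i < n; i++) {
            reusable.setLength(0);   // reset instead of reallocating
            reusable.append(i);
            sum2 += reusable.length();
        }

        // Both loops compute the same result; only allocation behavior differs.
        System.out.println(sum1 == sum2);
    }
}
```

Whether the reuse version actually wins depends on the workload and the JVM's allocator, so it is worth measuring rather than assuming.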
 