Thread processing time

Grishma Dube
Ranch Hand

Joined: Jul 01, 2003
Posts: 275
Hi All,

I am working on a rule engine application that uses Apache Lucene for indexing. The performance of this application is poor and inconsistent.
We are trying to find the bottleneck that is hurting performance.

The application creates threads to process the records, one thread for every 1 lakh (100,000) records. Each test run uses the same 32 lakh (3.2 million) records, so it creates exactly 32 threads. Even though the records and the server configuration are identical on every run, the processing time of these threads is different each time. The difference is more than an hour.

Could you guess the reason for this? What factors does the processing time of each thread depend on?

Thanks,
Grishma
Jim Akmer
Ranch Hand

Joined: Jul 06, 2010
Posts: 104
That depends. I do not know Lucene, but in general, if your threads execute a CPU-intensive task and you create far more threads than your system has CPU cores, you will get a lot of context switching and performance will degrade. If your threads perform a lot of I/O, performance can be better with the extra threads, since most of the time the threads will be blocked on I/O.
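To illustrate the point about thread count versus cores, here is a minimal sketch that runs 32 chunks of work on a fixed pool sized to the number of available cores instead of starting one thread per chunk. The class name and `processChunk` are hypothetical stand-ins, not the poster's code, and the right pool size for an I/O-heavy Lucene workload is an assumption that would need measuring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedPoolSketch {

    // Hypothetical stand-in for the real per-chunk work (indexing and searching).
    static long processChunk(int chunkId, int recordsPerChunk) {
        long checksum = 0;
        for (int i = 0; i < recordsPerChunk; i++) {
            checksum += (long) chunkId * 31 + i;
        }
        return checksum;
    }

    // Runs every chunk on a pool that never has more than poolSize worker threads.
    static List<Long> runChunks(int chunkCount, int recordsPerChunk, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int c = 0; c < chunkCount; c++) {
                final int chunkId = c;
                Callable<Long> task = () -> processChunk(chunkId, recordsPerChunk);
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // blocks until that chunk has finished
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        List<Long> results = runChunks(32, 100_000, cores);
        System.out.println("cores=" + cores + ", chunks done=" + results.size());
    }
}
```

With a pool like this the 32 chunks queue up and at most `cores` of them compete for the CPU at once, which should reduce context switching for CPU-bound work.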
Vinoth Kumar Kannan
Ranch Hand

Joined: Aug 19, 2009
Posts: 276

By 'processing time', do you mean the time difference between the start and end of a thread?
If so, it depends on the priority of the threads. If you have not set a priority for any of the threads, then it is up to the OS to do the time-slicing. It may choose to round-robin among threads 2, 5, 18, 22 and 10, and thereby finish only 4 of them. Now, let's say thread 10 has been started but is on hold. The OS may decide to move on to the rest of the threads. In that case, thread 10 may finish last, giving it a total time of an hour or so.
You would not want to compare thread timings like this, as they are completely OS-dependent and, of course, you don't know the scheduler's logic/algorithm.

If 'processing time' is instead the time a thread actually spends doing its job, that is, the running time of the thread (ignoring the time it spends waiting), then we can really do a comparison. But we cannot determine this from start and end timestamps alone, as we don't know when our thread is actually in the running state rather than the runnable state.
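As a rough illustration of the difference between elapsed time and running time, the sketch below measures both for a task on the current thread, using the standard `ThreadMXBean` from `java.lang.management`. The class and helper names are hypothetical, and per-thread CPU time is only reported when the JVM supports and enables it, so the sketch returns -1 when it does not.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuVsWallSketch {

    // Returns {wallNanos, cpuNanos} for running the task on the calling thread.
    // cpuNanos is -1 when the JVM cannot report per-thread CPU time.
    static long[] measure(Runnable task) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable =
                mx.isCurrentThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        long cpuStart = cpuAvailable ? mx.getCurrentThreadCpuTime() : 0;
        long wallStart = System.nanoTime();

        task.run();

        long wall = System.nanoTime() - wallStart;
        long cpu = cpuAvailable ? mx.getCurrentThreadCpuTime() - cpuStart : -1;
        return new long[] {wall, cpu};
    }

    // Helper so callers can pass a sleeping task without handling the checked exception.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // A sleeping task stands in for a thread that mostly waits.
        long[] t = measure(() -> sleepQuietly(200));
        System.out.println("wall ms=" + t[0] / 1_000_000 + ", cpu ms=" + t[1] / 1_000_000);
    }
}
```

If a worker's CPU time is roughly constant across runs while its wall-clock time swings widely, the variation is more likely coming from waiting (disk, locks, scheduling) than from the computation itself.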


OCPJP 6
Grishma Dube
Ranch Hand

Joined: Jul 01, 2003
Posts: 275
By processing time I meant the time difference between the start and end of a thread.

The issue is that each time I run the rule engine (same server, same configuration, same number of records, same number of threads), the time taken is different. And believe me, it's a huge difference.

1st run - took 81 mins to complete the whole job with 32 threads
2nd run - took 147.0 mins to complete the same job
3rd run - took 117.0 mins to complete the same job
4th run - took 160.0 mins to complete the same job

All the internal and external factors are the same, yet there is a huge difference in the time each run takes to complete the same job. I don't know how to troubleshoot this problem. Where exactly should I look for the problem: code, server configuration, CPU?
Sandeep Sanaboyina
Ranch Hand

Joined: Dec 14, 2009
Posts: 72
What exactly do you mean by record processing?

Maybe you can check there for the delay. Instead of only the start and end time, you can see which step is taking up the most time.
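One way to see which step takes the most time is to accumulate elapsed time per named step across all threads. The sketch below is only an illustration: the class name and the step names "index" and "search" are hypothetical, and the lambdas stand in for the real Lucene calls.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

class StepTimerSketch {

    // Total elapsed nanoseconds per step name; safe to update from many threads.
    private final Map<String, LongAdder> nanosByStep = new ConcurrentHashMap<>();

    // Runs the body, adds its elapsed time to the named step, and returns its result.
    <T> T time(String step, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            nanosByStep.computeIfAbsent(step, k -> new LongAdder())
                       .add(System.nanoTime() - start);
        }
    }

    // Total recorded for a step, or 0 if the step was never timed.
    long totalNanos(String step) {
        LongAdder adder = nanosByStep.get(step);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        StepTimerSketch timer = new StepTimerSketch();
        for (int record = 0; record < 1_000; record++) {
            final int r = record;
            // Placeholder work; the real code would call the indexing and search steps here.
            timer.time("index", () -> Integer.toString(r).hashCode());
            timer.time("search", () -> Integer.toString(r).length());
        }
        System.out.println("index ns=" + timer.totalNanos("index")
                + ", search ns=" + timer.totalNanos("search"));
    }
}
```

Comparing the per-step totals between a fast run and a slow run should show whether the extra time lands in one particular step.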


They say you have to be the first, the best or different. I say, is it too much to ask for all three.
Grishma Dube
Ranch Hand

Joined: Jul 01, 2003
Posts: 275
Hi Sandeep,

Each thread processes approximately 1 lakh (100,000) records. Processing means that, with the help of Lucene, it indexes and searches the records. There are 32 lakh records, and for every 1 lakh records it creates one thread.

Hope this helps.

Regards,
Grishma
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
    
How close does CPU utilization get to 100%?

If all Threads are reading from disk files, it seems to me there will be a lot of disk thrashing as each Thread tries to get to a different file. Random variation in the disk reading sequences would account for the variation in time.

Do you read and process a single record at a time, or buffer multiple records?

When I did something like this I had one Thread responsible for filling data buffers from disk while other Threads manipulated the data and wrote the results. Thus only one Thread was reading the disk.

Bill
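The single-reader arrangement described above can be sketched with a bounded `BlockingQueue`: one thread fills batches and hands them to worker threads. This is a minimal illustration, not Bill's actual code; the class name is hypothetical, an in-memory list stands in for the disk file, and counting records stands in for the real processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class SingleReaderSketch {

    // Sentinel batch that tells a worker to stop; compared by identity, not equals().
    private static final List<String> POISON = new ArrayList<>();

    // The calling thread acts as the only reader; workerCount threads process batches.
    static long run(List<String> records, int batchSize, int workerCount) {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(workerCount * 2);
        AtomicLong processed = new AtomicLong();

        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        List<String> batch = queue.take();
                        if (batch == POISON) {
                            break;
                        }
                        // Stand-in for the real work on each record.
                        processed.addAndGet(batch.size());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        try {
            // Only this loop touches the "disk"; put() blocks when workers fall behind.
            for (int from = 0; from < records.size(); from += batchSize) {
                int to = Math.min(from + batchSize, records.size());
                queue.put(new ArrayList<>(records.subList(from, to)));
            }
            for (int i = 0; i < workerCount; i++) {
                queue.put(POISON); // one stop signal per worker
            }
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            records.add("record-" + i);
        }
        System.out.println("processed=" + run(records, 500, 4));
    }
}
```

Because the queue is bounded, the reader cannot run arbitrarily far ahead of the workers, and the disk sees one sequential access pattern instead of many competing ones.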
 