I am working on a Rule Engine application that uses Apache Lucene for indexing. The performance of this application is poor and inconsistent.
We are trying to find the bottleneck that is hurting performance.
The application creates threads to process the records, one thread per 1 lakh (100,000) records. Each test run uses 32 lakh (3.2 million) records, so it creates exactly 32 threads, and every run processes the same set of records. Even so, the processing time of these threads differs from run to run, despite the same records and the same server configuration. The difference is more than an hour.
Could you guess the reason for this? What factors does the processing time of each thread depend on?
That depends. I do not know about Lucene, but generally, if your threads execute a CPU-intensive task and you create far more threads than your system has CPU cores, you will get a lot of context switching and performance will degrade. If your threads perform a lot of I/O operations, the extra threads hurt less, since most of the time the threads will be blocked on I/O rather than competing for the CPU.
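To make the thread-count point concrete, here is a minimal Java sketch, assuming the work is CPU-bound. It still splits the job into 32 batches but runs them on a fixed pool sized to the number of available cores instead of starting 32 threads at once. The names BoundedPoolSketch, processBatch and runAll are made up for illustration, and processBatch is only a stand-in for the real indexing/searching work, not anything from Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedPoolSketch {

    // Hypothetical stand-in for "process one batch of records":
    // it just sums the record ids that belong to the batch.
    static long processBatch(int batchId, int recordsPerBatch) {
        long sum = 0;
        for (int i = 0; i < recordsPerBatch; i++) {
            sum += (long) batchId * recordsPerBatch + i;
        }
        return sum;
    }

    // Submits one task per batch, but never runs more tasks at the same
    // time than there are cores reported by the JVM.
    static List<Long> runAll(int batches, int recordsPerBatch) throws Exception {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int b = 0; b < batches; b++) {
                final int batchId = b;
                Callable<Long> task = () -> processBatch(batchId, recordsPerBatch);
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // waits for that batch to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Long> results = runAll(32, 100_000);
        System.out.println("batches completed: " + results.size());
    }
}
```

If the work turns out to be mostly I/O-bound, a pool somewhat larger than the core count may do better; the right size is something to measure rather than assume.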
By 'processing time', do you mean the time difference between the start and end of a thread?
If so, it depends on the priority of the threads. If you have not set a priority for any of the threads, then it is up to the OS to do the time-slicing. It may, for example, round-robin among threads 2, 5, 18, 22 and 10 and finish only four of them. Now, let's say thread 10 has been started but is currently on hold. The OS may decide to move on to the rest of the threads. In that case, thread 10 may well finish last, giving it a total time of an hour or so.
Comparing thread timings this way is not meaningful, as the scheduling is completely OS dependent and, of course, you don't know the algorithm it uses.
If 'processing time' is instead the time a thread actually spends doing its job, that is, its running time (ignoring the time when the thread waits), then a comparison would make sense. But start and end timestamps alone cannot tell us this either, as we don't know when the thread was actually in the running state and when it was merely runnable.
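As a rough illustration of the difference between elapsed time and running time, the JVM can report per-thread CPU time through ThreadMXBean on platforms that support it. The sketch below measures both for a task run on the current thread; ThreadTimingSketch and time are hypothetical names, and the sleep in main merely simulates waiting.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class ThreadTimingSketch {

    // Returns {wallNanos, cpuNanos} for running the task on the current thread.
    // cpuNanos is -1 when the JVM cannot measure per-thread CPU time.
    static long[] time(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = bean.isCurrentThreadCpuTimeSupported()
                && bean.isThreadCpuTimeEnabled();

        long wallStart = System.nanoTime();
        long cpuStart = cpuAvailable ? bean.getCurrentThreadCpuTime() : 0L;

        task.run();

        long cpuNanos = cpuAvailable ? bean.getCurrentThreadCpuTime() - cpuStart : -1L;
        long wallNanos = System.nanoTime() - wallStart;
        return new long[] { wallNanos, cpuNanos };
    }

    public static void main(String[] args) {
        long[] t = time(() -> {
            try {
                Thread.sleep(200); // waiting: counts as wall time, not CPU time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("wall ms: " + t[0] / 1_000_000);
        System.out.println("cpu ms:  " + (t[1] < 0 ? "n/a" : String.valueOf(t[1] / 1_000_000)));
    }
}
```

If the CPU time of each thread is roughly constant across runs while the elapsed time swings widely, the variation is coming from waiting (scheduling, disk, locks) rather than from the computation itself.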
By processing time I meant the time difference between the start and end of a thread.
The issue is that every time I run the rule engine (same server, same configuration, same number of records, same number of threads), the time taken is different. And believe me, it is a huge difference.
1st run - took 81 mins to complete the whole job with 32 threads
2nd run - took 147 mins to complete the same job
3rd run - took 117 mins to complete the same job
4th run - took 160 mins to complete the same job
All the internal and external factors are the same, yet there is a huge difference in the time taken to complete the same job in each run. I don't know how to troubleshoot this problem. Where exactly should I look for it: the code, the server configuration, the CPU?
Each thread processes approximately 1 lakh (100,000) records. Processing here means that it indexes and searches the records with the help of Lucene. There are 32 lakh records, and one thread is created for every 1 lakh records.
If all the Threads are reading from disk files, it seems to me there will be a lot of disk thrashing as each Thread tries to get to a different file. Random variation in the order of the disk reads would account for the variation in time.
Do you read and process a single record at a time, or do you buffer multiple records?
When I did something like this I had one Thread responsible for filling data buffers from disk while other Threads manipulated the data and wrote the results. Thus only one Thread was reading the disk.
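A minimal Java sketch of that arrangement, under some assumptions: the calling thread is the only reader, it groups records into batches and hands them to worker threads through a bounded BlockingQueue. SingleReaderSketch and process are hypothetical names, the Iterable<String> stands in for whatever actually reads the file, and the worker body only counts records where the real indexing/searching would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class SingleReaderSketch {

    // Sentinel batch telling a worker that there is no more input.
    private static final List<String> POISON = new ArrayList<>();

    // Reads 'source' on the calling thread only and returns how many
    // records the workers handled.
    static long process(Iterable<String> source, int batchSize, int workers)
            throws InterruptedException {
        // Bounded queue: the reader blocks when the workers fall behind.
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(workers * 2);
        AtomicLong processed = new AtomicLong();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        List<String> chunk = queue.take();
                        if (chunk == POISON) {
                            return; // no more data for this worker
                        }
                        // Placeholder for the real per-record work.
                        processed.addAndGet(chunk.size());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        // The single reader: fill a buffer, hand it over, start a new one.
        List<String> batch = new ArrayList<>(batchSize);
        for (String record : source) {
            batch.add(record);
            if (batch.size() == batchSize) {
                queue.put(batch);
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            queue.put(batch); // last, partially filled buffer
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON); // one sentinel per worker
        }
        for (Thread t : pool) {
            t.join();
        }
        return processed.get();
    }
}
```

Because only one thread touches the input, the disk is read sequentially, and the bounded queue keeps the reader from running far ahead of the workers.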