Hi all, the functionality that I am trying to implement is: 1. Read a file. 2. Validate each record (line). 3. Store the record in the DB.
I want the record processing to happen in parallel. What I mean is: thread A reads the file and hands each line (record) to a sub-thread to validate and store in the DB; while the sub-thread is busy validating and storing, thread A continues to read the file. Basically, what I don't want is to read, validate, and store each record in a sequential pattern.
My initial sketch is something like this: 1. Create a pool of threads. 2. Create a job queue.
As the main thread reads the file, every record it fetches is put into the queue. As records become available in the queue, the second part of the process should take a record from the queue, validate it, and store it, then pick the next available record and continue until the queue is empty.
Is this the right way of doing this, or is there a better way? If so, can someone here please suggest one? Any tools or open-source libraries with this kind of functionality are also welcome. A bit of a code snippet to get started with would be much appreciated.
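The pool-plus-queue design described above can be sketched roughly as follows. This is only a minimal sketch, written against java.util.concurrent from Java 5; on JDK 1.4 the backport-util-concurrent library offers the same class names under edu.emory.mathcs.backport.java.util.concurrent, and the generics would have to be dropped. The class RecordPipeline, the isValid rule, and the in-memory list standing in for the database are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class RecordPipeline {
    // Hypothetical stand-in for the real validation rules.
    static boolean isValid(String record) {
        return record.trim().length() > 0;
    }

    // Reads lines on the calling thread and hands each one to a worker pool.
    // Returns the records that were "stored" (a list stands in for the DB).
    static List<String> process(BufferedReader reader, int workers)
            throws IOException, InterruptedException {
        final List<String> stored = new CopyOnWriteArrayList<String>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                final String record = line;
                // The pool's internal queue plays the role of the job queue.
                pool.execute(new Runnable() {
                    public void run() {
                        if (isValid(record)) {
                            stored.add(record); // a JDBC insert would go here
                        }
                    }
                });
            }
        } finally {
            pool.shutdown(); // accept no new tasks; queued ones still run
        }
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return stored;
    }

    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new StringReader("a\n\nb\nc\n"));
        System.out.println(process(in, 4).size()); // prints 3 (the blank line is rejected)
    }
}
```

One thing to watch with a huge file: Executors.newFixedThreadPool uses an unbounded queue, so a fast reader can pile up records in memory. A ThreadPoolExecutor built with a bounded queue and the CallerRunsPolicy rejection handler is one way to slow the reader down.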
Originally posted by JH Harrison: Thanks Henry. We are using JDK 1.4.2 and not JDK 5, hence we are looking for good libraries/APIs. Can you guide me with a little code snippet to get started? I am getting confused.
Thanks J Harrison.
The concurrent libraries in Java 5 are basically a port of the library that was developed by Doug Lea. That library runs on earlier platforms, and is available here.
Henry [ July 06, 2006: Message edited by: Henry Wong ]
This is an interesting problem and one that might have very surprising results. You might consider two thread pools so you can observe queue sizes and concurrent threads.
main thread reads a record, creates a massage task, puts it in queue
process thread gets a massage task, massages data a bit, creates an insert task, puts it in queue
update thread gets an insert task, executes SQL against the database
I'd be interested to see if the update threads keep up, or if database locks are roughly equivalent to synchronizing the update method. The update thread might get up to "n" tasks and build a batch update statement and then fire it off. It would also have to know when we're all done to fire the last partial batch and commit.
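The batching idea for the update thread could look something like the sketch below. It is an illustration only: BatchingWriter, the DONE sentinel, and the flush method (which just records each batch instead of running a JDBC batch and commit) are made-up names, and it assumes a single update thread. With several update threads, the reader would need to queue one sentinel per thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BatchingWriter implements Runnable {
    // Sentinel meaning "no more records". It is compared by identity,
    // so a real record with the same text can never be mistaken for it.
    static final String DONE = new String("DONE");

    private final BlockingQueue<String> queue;
    private final int batchSize;
    final List<List<String>> flushed = new ArrayList<List<String>>();

    BatchingWriter(BlockingQueue<String> queue, int batchSize) {
        this.queue = queue;
        this.batchSize = batchSize;
    }

    public void run() {
        List<String> batch = new ArrayList<String>();
        try {
            while (true) {
                String record = queue.take(); // blocks until a record arrives
                if (record == DONE) {
                    break;
                }
                batch.add(record);
                if (batch.size() == batchSize) {
                    flush(batch);
                    batch = new ArrayList<String>();
                }
            }
            if (!batch.isEmpty()) {
                flush(batch); // the last, partial batch
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stand-in for building and executing a batch update, then committing.
    private void flush(List<String> batch) {
        flushed.add(batch);
    }

    // Feeds the given number of records through one writer thread.
    static List<List<String>> demo(int records, int batchSize)
            throws InterruptedException {
        BlockingQueue<String> q = new LinkedBlockingQueue<String>(100);
        BatchingWriter writer = new BatchingWriter(q, batchSize);
        Thread t = new Thread(writer);
        t.start();
        for (int i = 0; i < records; i++) {
            q.put("rec" + i); // blocks if the writer falls 100 records behind
        }
        q.put(DONE);
        t.join();
        return writer.flushed;
    }
}
```

Calling BatchingWriter.demo(7, 3) yields three batches of sizes 3, 3, and 1. The bounded queue is what lets you observe whether the update side keeps up: when it does not, the reader blocks in put.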
Let us know how this runs!
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Hello, thanks for your help and suggestions. I downloaded backport-util-concurrent for JDK 1.4 and am trying out some of the API. What I have done so far is: 1. Parse a file. 2. Submit each line to a sub-task for further processing. Below is the code snippet which is doing this.
I just wanted some feedback to know whether I am heading in the right direction; I have never used these APIs before and I have never worked with threading before.
I also wanted to know how to terminate or empty the queue if there is a problem or exception in the validation process. The reason for asking is that our requirement says that if there are any errors in validation, file processing should stop and all the DB contents for that particular file should be rolled back.
Is this achievable? If so, can you guys please advise?
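One possible way to get the stop-on-first-error behaviour is to submit each record as a Callable, check the returned Futures, and call shutdownNow on the pool as soon as one of them fails. The sketch below assumes Java 5 syntax (with the backport on JDK 1.4 the generics and for-each loops would need rewriting), and FailFastLoader, validate, and the "BAD" rule are invented for illustration. The boolean result marks where a commit or rollback would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FailFastLoader {
    // Hypothetical rule: a record is invalid if it contains "BAD".
    static void validate(String record) {
        if (record.indexOf("BAD") >= 0) {
            throw new IllegalArgumentException("invalid record: " + record);
        }
    }

    // Returns true if every record validated (commit),
    // false as soon as one fails (roll back).
    static boolean load(List<String> records, int workers)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<String>> results = new ArrayList<Future<String>>();
        try {
            for (final String record : records) {
                results.add(pool.submit(new Callable<String>() {
                    public String call() {
                        validate(record);
                        return record;
                    }
                }));
            }
            for (Future<String> f : results) {
                // get() rethrows a worker's exception as ExecutionException.
                f.get();
            }
            return true;  // the caller would commit the transaction here
        } catch (ExecutionException e) {
            return false; // the caller would roll back the transaction here
        } finally {
            // Discards tasks still in the queue and interrupts running ones.
            pool.shutdownNow();
        }
    }
}
```

For example, FailFastLoader.load(Arrays.asList("a", "BAD", "c"), 2) returns false. Since a JDBC transaction belongs to a single Connection, rolling back everything for one file is simplest when all inserts for that file go through one connection with auto-commit turned off.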
subject: processing huge file in multithreaded env