We need to develop a feature where a user can upload a file, which is then parsed and saved to a database. Here is how I think it should be implemented, and I would like feedback on it.
We are going to implement it asynchronously.
The user will upload the file to the server.
A daemon process would run at a fixed interval, say every half an hour. It would parse the uploaded files and save them to the DB.
Now suppose 100 users upload files. If we process them in FIFO order, the last user would wait a long time for their file (starvation). To avoid that, we are thinking of running multiple schedulers at the same time, dividing the files among them so that parsing is done in parallel.
Is this an OK implementation? The only critical question is how we can make sure that two threads/schedulers do not parse the same file, i.e. how to manage task allocation.
How do you make sure that two threads don't parse the same file? You don't let those threads choose the files they are going to parse. Have a controller thread which selects the files from wherever they get uploaded to, and have it distribute those files to however many parser threads you choose to create.
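As a rough illustration of that arrangement, here is a minimal Python sketch, assuming the uploads are available as a list of paths and that the parser threads pull work from a single thread-safe queue filled by the controller. Every name in it (`controller`, `parser_worker`, `parse_and_save`, `NUM_PARSERS`) is hypothetical, and `parse_and_save` only records the path where the real parsing and database insert would go. Because each path is put on the queue exactly once and each `get()` hands an item to exactly one thread, no two parsers can receive the same file.

```python
import queue
import threading

NUM_PARSERS = 4      # hypothetical number of parser threads
_STOP = object()     # sentinel that tells a parser thread to exit


def parse_and_save(path, results, lock):
    # Placeholder: the real parsing and database insert would go here.
    with lock:
        results.append(path)


def parser_worker(tasks, results, lock):
    # A parser never picks files itself; it only takes what the queue hands it.
    while True:
        path = tasks.get()
        if path is _STOP:
            break
        parse_and_save(path, results, lock)


def controller(uploaded_files, num_parsers=NUM_PARSERS):
    # The controller is the only code that selects files.
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    workers = [
        threading.Thread(target=parser_worker, args=(tasks, results, lock))
        for _ in range(num_parsers)
    ]
    for worker in workers:
        worker.start()

    # Each file is enqueued once, so it is handed to exactly one parser.
    for path in uploaded_files:
        tasks.put(path)

    # One sentinel per worker so every thread shuts down cleanly.
    for _ in workers:
        tasks.put(_STOP)
    for worker in workers:
        worker.join()

    return results
```

Calling `controller(["a.csv", "b.csv", "c.csv"])` would process each file once, in whatever order the parser threads happen to pick them up. Files are still started in upload order, but with several parsers running the later uploads no longer wait for every earlier file to finish.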