Hey everyone,
I have a quick question about how to design a batch process in Java that reads data from files and inserts it into a database.
In a nutshell, I am going to receive numerous CSV files every day from different organisations. All of the files share the same format and contain people, cases and alerts. A person can be in zero to many cases, and an alert is always associated with a case. The records in each file can come in any order, with people, cases and alerts spread throughout the file.
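To make the relationships concrete, the records are shaped roughly like this (field names are invented for illustration, not the real column names):

    // Rough shape of the three record types (fields are my own invention).
    record Person(String personId, String name) {}

    // A person can appear in zero to many cases.
    record Case(String caseId, String personId) {}

    // An alert always belongs to exactly one case.
    record Alert(String alertId, String caseId, String message) {}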
My knee-jerk solution is to read each line of the file into memory, sorting the people, cases and alerts into separate HashMaps as I go. I can then process the people, cases and alerts in sequence, rejecting any records that are invalid.
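Roughly what I had in mind, as a sketch (I'm assuming the first column is a record-type tag and the second is an id, and I'm using a naive split rather than a proper CSV parser):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.HashMap;
    import java.util.Map;

    public class FileLoader {
        public static void main(String[] args) throws IOException {
            Map<String, String[]> people = new HashMap<>();
            Map<String, String[]> cases  = new HashMap<>();
            Map<String, String[]> alerts = new HashMap<>();

            // First pass: cache every record in memory, bucketed by type.
            try (BufferedReader reader = Files.newBufferedReader(Path.of("input.csv"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] fields = line.split(",");   // naive; a real CSV parser would handle quoting
                    if (fields.length < 2) {
                        continue;                        // skip blank or malformed lines
                    }
                    switch (fields[0]) {
                        case "PERSON" -> people.put(fields[1], fields);
                        case "CASE"   -> cases.put(fields[1], fields);
                        case "ALERT"  -> alerts.put(fields[1], fields);
                        default       -> System.err.println("Rejecting unknown record type: " + line);
                    }
                }
            }

            // Second pass: insert in dependency order - people first, then cases
            // whose person id exists, then alerts whose case id exists - rejecting orphans.
        }
    }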
However, with this solution, if a file is big (e.g. 100,000 lines) I will run into memory issues from caching everything. Given that I can't rely on the ordering within the file, though, I can't think of a better way to do this.
Any ideas?
Thanks,
Chris