Design for batch load application

Ravi Bansal
Ranch Hand

Joined: Aug 18, 2008
Posts: 86
I am developing an application in Java that has to load data from flat text files into DB tables. The problem is that the application is used by several vendors with different systems, and each system can generate a differently laid-out flat text file. For example, vendor V1 may generate a text file where a certain field starts at column 10, while vendor V2 may generate a text file where the same field starts at column 12. So, in effect, I have different parsing rules for different vendors.

What I want to achieve is an application that loads “rules” from some specific place (maybe a configuration file). These rules will be applied to the flat text file to extract the data.

Afterwards, some transformations will be applied to that data before saving it to the DB. Those transformations will again be vendor-specific. For example, vendor V1 may divide a certain field by 10 before saving it to the DB, while vendor V2 may divide the same field by 20.

How can I achieve this? What design would you recommend as best practice for this solution?

I was thinking of creating an MDB for this.

I could use JPA for saving the data to the DB (not sure if this is a good idea, as there could be 50-60k records in one text file).

But I am not sure what to use to make the parsing and transformation rules configurable.

Any suggestions will be appreciated.
Claude Moore
Ranch Hand

Joined: Jun 24, 2005
Posts: 680

I think that you have at least two different problems:

- Define an external record format for each line of your text files (process stage 0).

- Define a transformer for each file after stage 0.

For the first stage, you can easily put the format specifications in an XML file, with an entry like this:

You can then read that XML "driver" file, process each line of your text file accordingly, and put the resulting records in a list of HashMaps.
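The XML entry itself did not survive in this thread, so the following is only a sketch of what stage 0 could look like. The `<field name=".." start=".." length=".."/>` layout, the `FieldSpec` and `FixedWidthReader` names, and the 1-based column convention are all assumptions made for illustration, not the format originally posted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** One field of a fixed-width record (hypothetical type); start is a 1-based column. */
final class FieldSpec {
    final String name;
    final int start;
    final int length;

    FieldSpec(String name, int start, int length) {
        this.name = name;
        this.start = start;
        this.length = length;
    }
}

final class FixedWidthReader {

    /** Reads every <field/> element of the (assumed) driver XML into a FieldSpec. */
    static List<FieldSpec> loadSpecs(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("field");
            List<FieldSpec> specs = new ArrayList<FieldSpec>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                specs.add(new FieldSpec(
                        e.getAttribute("name"),
                        Integer.parseInt(e.getAttribute("start")),
                        Integer.parseInt(e.getAttribute("length"))));
            }
            return specs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid format definition", e);
        }
    }

    /** Cuts each line into named, trimmed values; fields beyond the end of a short line become "". */
    static List<Map<String, String>> parse(List<String> lines, List<FieldSpec> specs) {
        List<Map<String, String>> records = new ArrayList<Map<String, String>>();
        for (String line : lines) {
            Map<String, String> record = new LinkedHashMap<String, String>();
            for (FieldSpec spec : specs) {
                int from = Math.min(spec.start - 1, line.length());
                int to = Math.min(from + spec.length, line.length());
                record.put(spec.name, line.substring(from, to).trim());
            }
            records.add(record);
        }
        return records;
    }
}
```

With one definition that places a field at column 10 for V1 and another that places it at column 12 for V2, the same `parse` method would serve both vendors; only the XML changes.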

For the transformation stage, you could express the transformation rules in XML as well, but then you would have to write an interpreter for those rules. Would it be worth it? If you have a limited number of vendors, you can use a Factory to instantiate the concrete transformer for each vendor and process the data you have read.
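A minimal sketch of the Factory option, using the divide-by-10 and divide-by-20 rules from the question. The `RecordTransformer`, `DividingTransformer` and `VendorTransformerFactory` names, the `amount` field, and the scale of 4 decimal places are assumptions chosen for the example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;

/** Vendor-specific transformation applied to one parsed record (hypothetical interface). */
interface RecordTransformer {
    Map<String, Object> transform(Map<String, String> raw);
}

/** Copies the record and replaces the "amount" field with amount / divisor. */
final class DividingTransformer implements RecordTransformer {
    private final BigDecimal divisor;

    DividingTransformer(int divisor) {
        this.divisor = BigDecimal.valueOf(divisor);
    }

    @Override
    public Map<String, Object> transform(Map<String, String> raw) {
        Map<String, Object> out = new HashMap<String, Object>(raw);
        BigDecimal amount = new BigDecimal(raw.get("amount"));
        // An explicit scale avoids ArithmeticException on non-terminating results.
        out.put("amount", amount.divide(divisor, 4, RoundingMode.HALF_UP));
        return out;
    }
}

/** Picks the concrete transformer for a vendor id. */
final class VendorTransformerFactory {
    static RecordTransformer forVendor(String vendorId) {
        switch (vendorId) {
            case "V1":
                return new DividingTransformer(10);
            case "V2":
                return new DividingTransformer(20);
            default:
                throw new IllegalArgumentException("Unknown vendor: " + vendorId);
        }
    }
}
```

Adding a vendor then means adding one transformer class (or one constructor argument) and one `case`, which stays manageable as long as the number of vendors is small.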

About inserting the data into the DBMS: well, that's just machinery :-)
You can simply use JDBC if you prefer, without having to use JPA (especially if you have to deal with HashMaps of values).
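For files with 50-60k records, plain JDBC batching is one way to do it. The sketch below assumes a hypothetical `vendor_data (id, amount)` table and a `BatchLoader` helper; both are made up for the example, and the batch size is something to tune against your own database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.Map;

final class BatchLoader {
    // Hypothetical table and column names.
    private static final String SQL =
            "INSERT INTO vendor_data (id, amount) VALUES (?, ?)";

    /** Inserts all rows in one transaction, sending them in chunks of batchSize. */
    static int insertAll(Connection conn, List<Map<String, Object>> rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        int count = 0;
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            for (Map<String, Object> row : rows) {
                ps.setObject(1, row.get("id"));
                ps.setObject(2, row.get("amount"));
                ps.addBatch();
                count++;
                if (count % batchSize == 0) {
                    ps.executeBatch(); // flush a full chunk
                }
            }
            if (count % batchSize != 0) {
                ps.executeBatch(); // flush the remainder
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // the whole file is loaded or nothing is
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return count;
    }
}
```

Whether one transaction per file is the right granularity depends on how you want to handle a bad record in the middle of a file; committing per chunk is the obvious alternative.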

Hope this helps.
Roberto Perillo

Joined: Dec 28, 2007
Posts: 2270

I think there are several ways to solve your problem. One thing you can do is define an interface with a parse method that receives an InputStream object representing the content to be parsed. Then, you can map each parser to something that identifies which parser should be used. For instance, in the project that I'm about to finish right now, we deal with several types of files, and each has a specific format, so for each format there is one parser. Two characteristics identify the parser to be used: the file extension and the header of the file. CSV files are always parsed by the same parser. The same goes for XML files and DAT files. TXT files can be parsed by 6 other parsers; in this case, the header identifies the parser to be used. We have a class called HeaderScanner that has a Map<ReportFileType, Parser>, which is where we map file types to parsers. And since we use Spring, this map is defined in an XML file, in terms of beans, and injected into the HeaderScanner class.
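A plain-Java sketch of that arrangement, without Spring. Only the names `Parser`, `HeaderScanner` and `ReportFileType` come from the description above; the enum constants, the return type of `parse`, and the "header starts with SALES / STOCK" rule are hypothetical stand-ins, not the actual project's code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical file types; the real project distinguishes more TXT variants. */
enum ReportFileType { CSV, XML, DAT, TXT_SALES, TXT_STOCK }

/** One implementation per file format. */
interface Parser {
    List<Map<String, String>> parse(InputStream content) throws IOException;
}

final class HeaderScanner {
    private final Map<ReportFileType, Parser> parsers =
            new EnumMap<ReportFileType, Parser>(ReportFileType.class);

    /** The map would be injected by the container; here it is passed in directly. */
    HeaderScanner(Map<ReportFileType, Parser> parsers) {
        this.parsers.putAll(parsers);
    }

    /** Assumed rule: the extension decides, except for .txt, where the header line decides. */
    static ReportFileType detect(String fileName, String headerLine) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".csv")) return ReportFileType.CSV;
        if (lower.endsWith(".xml")) return ReportFileType.XML;
        if (lower.endsWith(".dat")) return ReportFileType.DAT;
        if (lower.endsWith(".txt")) {
            if (headerLine.startsWith("SALES")) return ReportFileType.TXT_SALES;
            if (headerLine.startsWith("STOCK")) return ReportFileType.TXT_STOCK;
        }
        throw new IllegalArgumentException("Unrecognized file: " + fileName);
    }

    /** Looks up the parser registered for the detected file type. */
    Parser parserFor(String fileName, String headerLine) {
        ReportFileType type = detect(fileName, headerLine);
        Parser parser = parsers.get(type);
        if (parser == null) {
            throw new IllegalStateException("No parser registered for " + type);
        }
        return parser;
    }
}
```

The same lookup idea would carry over to the vendor case: key the map by vendor id instead of file type, and each vendor gets its own parser without the calling code changing.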

Ravi Bansal wrote:
What I want to achieve is to create an application that loads “rules” from some specific place(may be some configurable file). These rules will be applied on flat text file to generate the data.

Hmm... can you show us an example of what those rules could look like?

Cheers, Bob "John Lennon" Perillo
SCJP, SCWCD, SCJD, SCBCD - Daileon: A Tool for Enabling Domain Annotations