How to perform optimal massive insert operation

shwetank singh
Greenhorn

Joined: Apr 02, 2007
Posts: 26
Hi experts,
Sorry if I'm asking a question that has been answered scores of times!

I need to upload as many as one million records into an Oracle table. The table has 6 columns (in case that has any impact on optimization), and alas, it doesn't have a primary key or any indexes. The table is truncated for every new process (use all the data, then truncate; it's used for staging).

I've tried the Oracle /*+ APPEND */ hint as well, but with no noticeable improvement.

May I please have some hints on how to perform this insert as fast as possible? The transaction is managed by the container (I'm using Struts, and EJB as the persistence provider), so I have a very limited time window to accommodate this transaction in.

I don't know if I've framed the question right.
Thanks a lot in advance!

Any ideas?
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30548

Does it have to be in Java? If you can use SQL Loader, it would likely be noticeably faster.


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
John Bengler
Ranch Hand

Joined: Feb 12, 2009
Posts: 133
Hi!

Is it possible for you to use a database import tool? Something like Data Pump for Oracle (or, in earlier times, SQL*Loader, Import, etc.).
For such a tool, inserting a million rows into a table with just six columns and no constraints or indexes should take only seconds, unless your six columns contain large CLOBs/BLOBs.

I don't think EJB makes much sense for such massive data manipulation.
You surely don't want to create one million objects just to persist them to the DB and then "forget" them...


By the way (off topic): is it possible to have entity beans for a table without a PK? How do merge or delete work for those entities?
Does it compose a composite PK consisting of all columns in the background...?


John

[edit] ah, someone was faster than me..
shwetank singh
Greenhorn

Joined: Apr 02, 2007
Posts: 26
Thanks Jeanne and John!!

I'm sorry for not having explained the problem from the root. Yes, Jeanne, it is very logical for you to suggest SQL*Loader, and for John to question the EJBs.
The application actually has a JSP where a user can upload data in .txt format. The file generally contains half a million to one million records, and the data in every row (to be mapped to DB columns) is separated by commas.

Why EJB? The client and service may not be collocated in the same JVM, so I made a container-managed stateless session bean. The remote interface is called by the client, and the file is first broken up with a tokenizer and saved into value objects, which in turn are collected in an ArrayList. I then send the ArrayList over through the beans to process the data in a DAO pattern. The DAO class connects to the database and tries to insert the data into a table.
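For reference, the parse step looks roughly like this (RowVO is just a stand-in for my actual value object; note that StringTokenizer skips empty fields):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class UploadParser {

    // Stand-in for the real value object: one row, six columns.
    public static class RowVO {
        final String[] cols = new String[6];
    }

    // Read the uploaded .txt file and turn each comma-separated line
    // into a value object, collecting them all in a list.
    public List<RowVO> parse(String path) throws Exception {
        List<RowVO> rows = new ArrayList<RowVO>();
        BufferedReader in = new BufferedReader(new FileReader(path));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                StringTokenizer tok = new StringTokenizer(line, ",");
                RowVO vo = new RowVO();
                for (int i = 0; i < 6 && tok.hasMoreTokens(); i++) {
                    vo.cols[i] = tok.nextToken();
                }
                rows.add(vo);
            }
        } finally {
            in.close();
        }
        return rows;
    }
}
```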

I was then supposed to invoke some other business logic over this data and formulate the final result. Now the problem...
I have detached all further business logic and written it into a database procedure (which I call using a CallableStatement), so all the heavy data processing happens on the database server instead of the app server going through this thrashing. That saved me a lot, but one thing remains: if only I could somehow optimize the massive insert I have to do in the DAO...
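The hand-off to the procedure is just a CallableStatement call, roughly like this (the procedure name and parameters below are only placeholders for the real signature):

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

public class StagingProcessor {

    // Hand the heavy processing off to the database. PROCESS_STAGING and
    // its parameters are placeholders for the real procedure signature.
    public int process(Connection con, String batchId) throws SQLException {
        CallableStatement cs = con.prepareCall("{call PROCESS_STAGING(?, ?)}");
        try {
            cs.setString(1, batchId);                  // identifies this upload
            cs.registerOutParameter(2, Types.INTEGER); // e.g. rows processed
            cs.execute();
            return cs.getInt(2);
        } finally {
            cs.close();
        }
    }
}
```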

Perhaps I'm just not able to see a solution because of a lack of knowledge on my part. I hope I've made the situation as clear as possible.

I doubt JPQL would handle this any better; besides, the design I'm using is based on native queries and it's EJB 2, so I don't have the privilege of using @Entity and merge/persist operations.
Thanks a ton!
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30548

How long does the million records take via EJB? Do you have a time limit requirement? I'm trying to understand your end goal here.
shwetank singh
Greenhorn

Joined: Apr 02, 2007
Posts: 26
We have an HTTP timeout set to 10 minutes. The container transaction timeout, however, has been set to 5 minutes, as the application is under high traffic.

Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30548

shwetank singh wrote: We have an HTTP timeout set to 10 minutes. The container transaction timeout, however, has been set to 5 minutes, as the application is under high traffic.

There are a few numbers here:
10 minutes - HTTP timeout - how long your server has to respond with something
5 minutes - container transaction timeout (you aren't going to insert a million rows in 5 minutes)

The question I had was how long it is acceptable to take to insert all of those records. You can do the insert asynchronously and return a "processing - please wait" message immediately, which means you aren't limited to the 5/10 minutes. You could even return "processing - please check back later" if it's going to take hours.
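As a rough sketch (the class and method names here are only illustrative, and the real work still has to live somewhere):

```java
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UploadService {

    // One background thread is plenty for a staging load.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Kick off the load and return right away; the user sees a
    // "processing" page instead of a timed-out request.
    public String startLoad(final File uploadedFile) {
        worker.submit(new Runnable() {
            public void run() {
                doBulkInsert(uploadedFile); // runs outside the HTTP request
            }
        });
        return "processing - please check back later";
    }

    private void doBulkInsert(File f) {
        // the long-running JDBC insert goes here
    }
}
```

A single worker thread also serializes the loads, so two users uploading at once can't thrash the staging table.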
shwetank singh
Greenhorn

Joined: Apr 02, 2007
Posts: 26
Thanks.
I actually can't decide on server timeouts and bean keep-alive times; that's beyond my control and role.
I had already thought of both of the messages you suggest. The first one is essentially the progress bar I already have; the other amounts to telling someone who checks back after an hour that we have a report for their transaction.
Both are helpful to some extent, but they aren't answers to my basic question: how to do the massive insert.
Thanks yet again.
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659

shwetank singh wrote: to upload as many as one million records into an Oracle table.

As others have said, don't do that. Don't expect JDBC to give you the performance you want. Use a better tool.
Within Java you can use Runtime.exec() to run anything. Or write the file to a special directory and have a cron job or other recurring script look for the file, move it to a working directory, and do the load there.
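For example, invoking SQL*Loader from Java might look something like this (the credentials, control file path, and options are placeholders, not a recommendation):

```java
import java.io.File;
import java.io.IOException;

public class SqlLoaderLauncher {

    // Shell out to SQL*Loader instead of pushing rows through JDBC.
    public static int load(File dataFile) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "sqlldr",
                "userid=scott/tiger@ORCL",          // placeholder credentials
                "control=/opt/app/staging.ctl",     // maps CSV fields to columns
                "data=" + dataFile.getAbsolutePath(),
                "direct=true");                     // direct-path load
        pb.redirectErrorStream(true);
        Process p = pb.start();
        return p.waitFor(); // 0 means sqlldr finished successfully
    }
}
```

In real code you'd also drain p.getInputStream() so the child process can't block on a full output buffer.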

Sometimes the best way to do something is to change the whole problem.
shwetank singh
Greenhorn

Joined: Apr 02, 2007
Posts: 26
Thanks Pat!

I got it. Here's how: I read through the file and create a copy of it in an app server directory (this saves me the cost of reading through the file and creating innumerable objects). I then pass this file reference to the remote bean and read the file directly through a local call, which saves me a lot of network cost.

Second, to insert that many rows I combined all the ideas the experts gave: I put an upper bound on the file size and used PreparedStatement batch execution. This saved me as much as 5 minutes in processing the whole logic.
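For anyone who finds this later, the batched insert looks roughly like this (the table, columns, and batch size are placeholders; tune the batch size against your own data):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class StagingDao {

    private static final int BATCH_SIZE = 1000; // worth experimenting with

    // Stream the file and insert in batches inside one transaction;
    // the staging table and its columns are placeholders.
    public void bulkInsert(Connection con, String path) throws Exception {
        con.setAutoCommit(false);
        PreparedStatement ps = con.prepareStatement(
                "INSERT INTO staging (c1, c2, c3, c4, c5, c6) VALUES (?, ?, ?, ?, ?, ?)");
        BufferedReader in = new BufferedReader(new FileReader(path));
        try {
            String line;
            int count = 0;
            while ((line = in.readLine()) != null) {
                String[] cols = line.split(",", -1); // -1 keeps empty fields
                for (int i = 0; i < 6; i++) {
                    ps.setString(i + 1, i < cols.length ? cols[i] : null);
                }
                ps.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch(); // one round trip per thousand rows
                }
            }
            ps.executeBatch(); // flush the final partial batch
            con.commit();
        } finally {
            in.close();
            ps.close();
        }
    }
}
```

Very large batches just move the memory pressure into the driver, so bigger is not automatically better.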

Thanks to all!
 