
How to perform optimal massive insert operation

 
shwetank singh
Greenhorn
Posts: 26
Hi experts,
Sorry if I'm asking a question that has been answered scores of times!

I am trying to upload as many as 1 million records into an Oracle table. The table has 6 columns (in case that has some impact on optimization), and it doesn't have a primary key or any indexes. The table is truncated for every new process (use all the data, then truncate; it's used for staging).

I've tried the Oracle direct-path hint as well (INSERT /*+ APPEND */ INTO ...), but with no noticeable improvement.

May I please have some hints on how to perform this insert as fast as possible? The transaction is container-managed (I'm using Struts, with EJB for persistence), so I have a very limited time window to accommodate this transaction in.

I don't know if I've framed the question right.
Thanks a lot in advance!

Any ideas?
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
Does it have to be in Java? If you can use SQL*Loader, it would likely be noticeably faster.
 
Ranch Hand
Posts: 133
Hi!

Is it possible for you to use a database import tool? Something like Data Pump for Oracle (or, in former times, SQL*Loader, import, etc.)?
For such a tool, inserting a million rows into a table with just six columns and no constraints or indexes should take only a few seconds, unless your six columns contain large CLOBs/BLOBs.

I think EJB doesn't make too much sense for such a massive data manipulation.
You surely don't want to create one million objects just to persist them to the DB and then "forget" them...


By the way (off topic): is it possible to have entity beans for a table without a PK? How does a merge or a delete work for those entities?
Does it build a composite PK consisting of all columns in the background?


John

[edit] ah, someone was faster than me..
 
shwetank singh
Greenhorn
Posts: 26
Thanks Jeanne and John!!

Sorry for not having explained the problem from the root. Yes, Jeanne, it is very logical to suggest SQL*Loader, and John, your point about EJBs is well taken.
The application actually has a JSP where a user can upload data as a .txt file. The file generally contains half a million to a million records, and the data in every row (to be mapped to DB columns) is comma-separated.

Why EJB? The client and the service may not be collocated in the same JVM, so I made a container-managed stateless session bean. The client calls the remote interface; the file is first broken up with a tokenizer and saved into value objects, which in turn are stored in an ArrayList. I then send the ArrayList through the beans to process the data with a DAO pattern. The DAO class connects to the database and inserts the data into the table.

I was then supposed to invoke some further business logic over this data and formulate the final result. Now the problem:
I have detached all that further business logic and written it into a database procedure (which I call using a CallableStatement), so all the heavy data processing happens on the database server instead of the app server thrashing through it. That helped me save a lot, but one thing remains: if only I could somehow optimize the massive insert I have to do in the DAO.
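For reference, the procedure call from the DAO looks roughly like this (a sketch only; PROCESS_STAGING and its OUT parameter are hypothetical names standing in for my real procedure):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.Types;
import javax.sql.DataSource;

public class StagingDao {
    private final DataSource dataSource;

    public StagingDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Calls the (hypothetical) PROCESS_STAGING procedure and returns its status code. */
    public int processStaging() throws Exception {
        Connection con = dataSource.getConnection();
        try {
            CallableStatement cs = con.prepareCall("{call PROCESS_STAGING(?)}");
            try {
                cs.registerOutParameter(1, Types.INTEGER); // e.g. rows processed
                cs.execute(); // all the heavy lifting happens inside the DB
                return cs.getInt(1);
            } finally {
                cs.close();
            }
        } finally {
            con.close();
        }
    }
}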

Perhaps I just can't see a solution because of my lack of knowledge; I hope I've made the situation as clear as possible.

I doubt JPQL would handle it any better; in any case, the design I'm using is based on native queries and it's EJB 2, so I don't have the privilege of using @Entity and merge/persist operations.
Thanks a ton!
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
How long does inserting the million records take via EJB? Do you have a time limit requirement? I'm trying to understand your end goal here.
 
shwetank singh
Greenhorn
Posts: 26
We have an HTTP time-out set to 10 minutes. The container transaction time-out, however, has been set to 5 minutes, as the application is under high traffic.

 
Jeanne Boyarsky
author & internet detective
Posts: 41860

shwetank singh wrote: We have an HTTP time-out set to 10 minutes. The container transaction time-out, however, has been set to 5 minutes, as the application is under high traffic.


There are a few numbers here:
10 minutes - HTTP time-out - how long your server has to respond with something
5 minutes - container transaction time-out (you aren't going to insert a million rows in 5 minutes)

The question I had was how long it is acceptable to take to insert all of those records. You can do the insert asynchronously and return a "processing - please wait" message immediately. That means you aren't limited to the 5/10 minutes. You could even return "processing - please check back later" if it's going to take hours.
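For instance, the hand-off could look something like this (a rough sketch: in a real EJB 2 container you'd more likely use JMS than raw threads, and the job-status bookkeeping here is invented for illustration):

import java.io.File;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UploadService {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    // job id -> "RUNNING", "DONE" or "FAILED"; the JSP polls this
    private final ConcurrentMap<String, String> status = new ConcurrentHashMap<String, String>();

    /** Kicks off the load in the background and returns a job id the client can poll. */
    public String startLoad(final File uploadedFile) {
        final String jobId = UUID.randomUUID().toString();
        status.put(jobId, "RUNNING");
        executor.submit(new Runnable() {
            public void run() {
                try {
                    // doInsert(uploadedFile); // the actual batched insert goes here
                    status.put(jobId, "DONE");
                } catch (Exception e) {
                    status.put(jobId, "FAILED");
                }
            }
        });
        return jobId; // the JSP can immediately render "processing - please wait"
    }

    public String getStatus(String jobId) {
        return status.get(jobId);
    }
}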
 
shwetank singh
Greenhorn
Posts: 26
Thanks.
I actually can't decide on the server time-outs and bean keep-alive times; that's beyond my control and role.
I had already thought of both of the messages you suggested. The first one is essentially the progress bar I already have; the other amounts to saying "we have a report for your transaction" when someone checks back after an hour.
Both are helpful to some extent, but they aren't solutions to my basic question: how to do the massive insert.
Thanks yet again!
 
Pat
Rancher
Posts: 4803

shwetank singh wrote: ...to upload as many as 1 million records into an Oracle table.


As others have said, don't do that. Don't expect JDBC to give you the performance you want. Use a better tool.
Within Java you can use Runtime.exec() (or ProcessBuilder) to launch anything, SQL*Loader included. Or write the file to a special directory, and have a cron job or other scheduled script look for the file, move it to another working directory, and do the load on it there.
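For instance, kicking off SQL*Loader from Java might look roughly like this (a sketch only: it assumes sqlldr is installed on the app server, and the connect string and the staging.ctl control file describing the six columns are placeholders):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class SqlLoaderLauncher {
    /** Launches SQL*Loader against an uploaded data file and waits for it to finish. */
    public static int load(String dataFile) throws Exception {
        // user/password@db and staging.ctl are placeholders for the real
        // connect string and the control file mapping the six staging columns
        ProcessBuilder pb = new ProcessBuilder(
                "sqlldr", "user/password@db", "control=staging.ctl", "data=" + dataFile);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
        String line;
        while ((line = out.readLine()) != null) {
            System.out.println(line); // or hand it to your logger
        }
        return p.waitFor(); // exit code 0 means the load succeeded
    }
}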

Sometimes the best way to do something is to change the whole problem.
 
shwetank singh
Greenhorn
Posts: 26
Thanks, Pat!

I got it. Here's how: I read through the uploaded file and create a copy of it in an app server directory (this saves me the cost of re-reading the file and creating innumerable objects). I then pass this file reference to the remote bean and read the file directly in a local call, which saves me a lot of network cost.

Second, to insert this many rows I combined all the ideas the experts gave: I put an upper bound on the file size and used PreparedStatement batch execution. This saved me as much as 5 minutes in processing the whole logic.
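In code, the batched insert is roughly this (a sketch; STAGING_TABLE and its column names are placeholders, it assumes every row has exactly six fields, and the batch size is something to tune):

import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class BatchInsert {
    private static final int BATCH_SIZE = 1000; // tune against your driver and DB

    /** Reads the comma-separated file and inserts it in batches. */
    public static void load(Connection con, String fileName) throws Exception {
        con.setAutoCommit(false); // one commit at the end, not one per row
        PreparedStatement ps = con.prepareStatement(
                "INSERT INTO STAGING_TABLE (C1, C2, C3, C4, C5, C6) VALUES (?, ?, ?, ?, ?, ?)");
        BufferedReader in = new BufferedReader(new FileReader(fileName));
        try {
            String line;
            int count = 0;
            while ((line = in.readLine()) != null) {
                String[] fields = line.split(",", -1); // -1 keeps empty trailing fields
                for (int i = 0; i < 6; i++) {
                    ps.setString(i + 1, fields[i]);
                }
                ps.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch(); // ship the batch to the server in one round trip
                }
            }
            ps.executeBatch(); // flush the remainder
            con.commit();
        } finally {
            in.close();
            ps.close();
        }
    }
}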

Thanks to all!
 