File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes Object Relational Mapping and the fly likes bulk insert Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Murach's Java Servlets and JSP this week in the Servlets forum!
JavaRanch » Java Forums » Databases » Object Relational Mapping
Bookmark "bulk insert" Watch "bulk insert" New topic
Author

bulk insert

Neeraj Vij
Ranch Hand

Joined: Nov 25, 2003
Posts: 315
Hi,


I need to insert large number of records after reading a xml into the database. is it good to go for hibernate in case of bulk insert/update operations or jdbc will be better ?

please provide some inputs.

Thanks,
Neeraj.
David O'Meara
Rancher

Joined: Mar 06, 2001
Posts: 13459

It depends. If it was a one-off run or something that ran on huge amounts of data (running several minutes or more) I would look at raw JDBC. If it was part of a larger application (particularly an already using hibernate) I'd look at Hibernate.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    
    5

Incidentally Hibernate does have ways of performing bulk operations if you are 'forced' or 'stuck' down that technology route.


Cheers, Martijn - Blog,
Twitter, PCGen, Ikasan, My The Well-Grounded Java Developer book!,
My start-up.
Neeraj Vij
Ranch Hand

Joined: Nov 25, 2003
Posts: 315
Hi,
Thanks for inputs. We need to insert lot of data into 6-7 tables which has got some relationships between them.

We are still evaluating the options. I read few articles where jdbc batch update is even faster than using pl/sql code invked from a java applications.

I was thinking about hibernate only because it would have taken care of the relationships and inserted the data in correct order. But I have a high priority requirement of high performance. So I got inclined towards JDBC which is faster for inserts/updates for batch updates which is not directly supported by hibernate.

Any more inputs.

Thanks
Neeraj.
Arun Kumarr
Ranch Hand

Joined: May 16, 2005
Posts: 513

however if you are trying to insert multiple rows into multiple tables and if you are using set/list/bag or any such collections, you need to take care of the classical n+1 table inserts issue.
I however would prefer direct JDBC inserts and updates when it comes to handling with huge data.


If you are not laughing at yourself, then you just didn't get the joke.
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336


We are still evaluating the options. I read few articles where jdbc batch update is even faster than using pl/sql code invked from a java applications.

I was thinking about hibernate only because it would have taken care of the relationships and inserted the data in correct order. But I have a high priority requirement of high performance. So I got inclined towards JDBC which is faster for inserts/updates for batch updates which is not directly supported by hibernate.

If your only driver is performance, I would suggest looking at the bulk data loading tools that come with your database. These will allow you to influence performance features of the database itself that will be difficult to do from JDBC.


JavaRanch FAQ HowToAskQuestionsOnJavaRanch
Neeraj Vij
Ranch Hand

Joined: Nov 25, 2003
Posts: 315
thanks. I need to insert data by reading a xml into sql server 2005. Single xml will have multiple records which needs to be inserted in 8-9 tables. Performance is the major driver for us.

please suggest option.

Thanks,
Neeraj.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    
    5

It depends on what your XML content is and how the different parts relate to each other. Are the 7-8 tables related to each other?
Neeraj Vij
Ranch Hand

Joined: Nov 25, 2003
Posts: 315
yes .. those tables are related.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    
    5

Since you have foreign key relationships between several tables you're gonig to have some difficulties creating a highly performant and highly 'correct' solution.

A way you could do it would roughly follow these steps

1.) Pull the records out of the XML (using Stylesheets or Java XML libs).
2.) Create the insert statements for the records (I'd use stylesheets for this if possible).
3.) Bulk insert the records into the database (fast performance).
4.) Switch on referential integrity and see what breaks (accuracy of data).

The main issue with the above approach is that you may have to manually fix an awful lot of relationships, so it would be a good idea to analyse your data beforehand.

Hope that helps!
Neeraj Vij
Ranch Hand

Joined: Nov 25, 2003
Posts: 315
Hello,

Thanks for your inputs.

I was also working on the same approach mentioned by you. But again I am stuck up in taking a decision between "Batch Insert" or "switching off the refrential constraints".

I am thinking on compromising some peformance as compared to duplicate effort to resolve data issues. which will require manual checking and time.

wats your opinion.

Thanks,
Neeraj.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    
    5

How important is the performance really?

If this is this a one off data migration then I suggest you leave the referential integrity on before you put the data in (correct data is more important than performance in this case).

If it's a regular occurrence or if you are under time frame difficulties (for the actual insert itself) then it really depends on how good your data is, I think all you can do is do a little bit of data analysis beforehand and try to get a sense of how good it is. If it's very good then you can bulk insert and deal with the manual fixes afterwards, if it's not so good then I'd get it fixed before bulk inserting.

Data migration/bulk inserting like this is never an easy exercise, good luck!
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: bulk insert
 
Similar Threads
Bad Performance for JDBC Insert
Million record upload and diplay with pagination
Bulk Insert in JPA 2
Scrabble... it only takes a minute...
EJB3 -> Best optimizzed way for bulk insert ?