
Updating many records at a time

 
pandu ranga
Greenhorn
Posts: 23
Hi

The scenario is this: I have a big XML file containing millions of records. I have parsed it and extracted its elements, and now I want to insert the data into the database. Which of the following is the best approach, and why?

1) insert data one by one
2) insert data in batches
3) insert data all at once.

thanks in advance.
 
Nitesh Kant
Bartender
Posts: 1638
Originally posted by pandu ranga:

3) insert data all at once.


What is the meaning of the 3rd option?
If the number of records is huge then you can use batch updates. It will perform better than inserting data one by one.
[ March 07, 2008: Message edited by: Nitesh Kant ]
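For reference, a minimal sketch of a JDBC batch insert using PreparedStatement.addBatch() and executeBatch(); the table name, the columns, and the Record holder are assumptions standing in for whatever the XML parsing produces.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchInsertSketch {

    // Hypothetical holder for one parsed XML element.
    public static class Record {
        final String name;
        final int value;
        public Record(String name, int value) { this.name = name; this.value = value; }
    }

    public static void insert(Connection con, List<Record> records) throws SQLException {
        String sql = "INSERT INTO my_table (name, value) VALUES (?, ?)"; // table/columns are assumptions
        PreparedStatement ps = con.prepareStatement(sql);
        try {
            int count = 0;
            for (Record r : records) {
                ps.setString(1, r.name);
                ps.setInt(2, r.value);
                ps.addBatch();
                // Send the accumulated rows to the server every 1000 rows
                // so the batch never grows unbounded in memory.
                if (++count % 1000 == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // flush any remaining rows
        } finally {
            ps.close();
        }
    }
}

Flushing every 1000 rows is an arbitrary choice; the sweet spot depends on the driver and the database.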
 
Ilja Preuss
author
Sheriff
Posts: 14112
Many databases actually come with a tool to import data from XML. You might want to take a look at that possibility.
 
Eric Shao
Greenhorn
Posts: 5
The second method will achieve better performance, but you must have a fast way to parse the big XML file.
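For the parsing side, a streaming parser such as StAX keeps memory usage flat regardless of file size. A minimal sketch, where the file name, the "record" element, and the "name" attribute are assumptions:

import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StreamingParseSketch {

    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Read the document as a stream of events instead of building a DOM tree,
        // so only the current element is held in memory.
        XMLStreamReader reader =
                factory.createXMLStreamReader(new FileInputStream("big-file.xml")); // file name is an assumption
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "record".equals(reader.getLocalName())) {      // element name is an assumption
                    String name = reader.getAttributeValue(null, "name"); // attribute name is an assumption
                    // ... collect the values here and hand them to the batch inserter
                }
            }
        } finally {
            reader.close();
        }
    }
}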
 
pandu ranga
Greenhorn
Posts: 23
That's fine, but why does adding the records in batches improve performance? Is there any particular reason for this?
 
steve souza
Ranch Hand
Posts: 862
<<That's fine, but why does adding the records in batches improve performance? Is there any particular reason for this?>>
There is probably no one answer for all backends and JDBC drivers. I have tested this with Sybase as a backend. In general, with batches there is less work and less network I/O.

For example, if you commit (even if the commit is implicit) after each row, the server must do its commit overhead and communicate that back to the client once for each row. If this work is done once per batch instead of once per row, there is less work for the server, the client, and the network. The performance difference can be significant.

I would suggest you do a test in your environment and measure to see if it makes a difference for you. Post the results too.
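To illustrate the per-row commit overhead described above, here is a minimal sketch that turns off auto-commit, commits once per batch of 1000 rows, and times the whole run so it can be compared against a per-row version; the table, the column, and the 1000-row batch size are assumptions.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class CommitPerBatchSketch {

    public static long timedInsert(Connection con, List<String> names) throws SQLException {
        long start = System.currentTimeMillis();
        con.setAutoCommit(false); // no implicit commit (and server round trip) per row
        String sql = "INSERT INTO my_table (name) VALUES (?)"; // table/column are assumptions
        PreparedStatement ps = con.prepareStatement(sql);
        try {
            for (int i = 0; i < names.size(); i++) {
                ps.setString(1, names.get(i));
                ps.addBatch();
                if ((i + 1) % 1000 == 0) {
                    ps.executeBatch();
                    con.commit(); // one commit per 1000 rows instead of one per row
                }
            }
            ps.executeBatch(); // flush and commit whatever is left
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            ps.close();
        }
        return System.currentTimeMillis() - start;
    }
}

Running the same loop with auto-commit on and ps.executeUpdate() per row gives the one-by-one number to compare against.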
 