| Author |
How to insert to MySql efficiently?
|
Susan Smith
Ranch Hand
Joined: Oct 13, 2007
Posts: 223
|
|
I'm trying to insert data into an empty table. At the end, it will have 60 million+ records. This is the pseudocode of what I do currently: It's taking 3.5 - 4 hours. I'm trying to cut down the time as much as possible. Then I read this article: http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html It says that:
If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time.
I'm thinking it must be something like this: What I'm also wondering is whether addBatch mechanism that I do is already like what is mentioned in the article? Can somebody please advise? Thanks in advance for all your help.
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32651
|
|
I tried that sort of thing, but more complicated, more as a test of performance than anything else, and I got nearly 60,000,000 records in. Took about a week . . . So I don't think you need worry about 3� hours. You can't tell from the API how that method is implemented; you could try getting whichever class implements Statement and see whether that tells you. Statement and PreparedStatement are actually interfaces, so they don't have actual implementations.
|
 |
Scott Selikoff
Saloon Keeper
Joined: Oct 23, 2005
Posts: 3652
|
|
|
60 million? How often do you perform the insert? I'd recommend putting everything in a SQL file and inserting it directly into the database. That number of records never works that well via JDBC even with batch updates.
|
My Blog: Down Home Country Coding with Scott Selikoff
|
 |
Susan Smith
Ranch Hand
Joined: Oct 13, 2007
Posts: 223
|
|
|
It's a one time insert for the 60+ million. But in other parts in my project, I have to do insert/ update in millions too (< 10 millions approximately), that's why I'm looking for a really really efficient way to make things run more smoothly.
|
 |
Paul Clapham
Bartender
Joined: Oct 14, 2005
Posts: 16483
|
|
|
If you're looking for an efficient method, and you have more than one method to choose from, then I would advise setting up some fairly large test data sets and running them through the different methods. Measure whatever counts as "efficiency" (in your case it sounds like it's wall-clock time) and see which method wins.
|
 |
 |
|
|
subject: How to insert to MySql efficiently?
|
|
|