This week's book giveaway is in the Cloud/Virtualizaton forum.
We're giving away four copies of Mesos in Action and have Roger Ignazio on-line!
See this thread for details.
Win a copy of Mesos in Action this week in the Cloud/Virtualizaton forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

How to insert to MySql efficiently?

 
Susan Smith
Ranch Hand
Posts: 224
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm trying to insert data into an empty table. At the end, it will have 60 million+ records.

This is the pseudocode of what I do currently:


It's taking 3.5 - 4 hours. I'm trying to cut down the time as much as possible.

Then I read this article: http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html
It says that:
If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time.


I'm thinking it must be something like this:


What I'm also wondering is whether addBatch mechanism that I do is already like what is mentioned in the article? Can somebody please advise?

Thanks in advance for all your help.
 
Campbell Ritchie
Sheriff
Pie
Posts: 48968
60
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I tried that sort of thing, but more complicated, more as a test of performance than anything else, and I got nearly 60,000,000 records in. Took about a week . . .

So I don't think you need worry about 3� hours.

You can't tell from the API how that method is implemented; you could try getting whichever class implements Statement and see whether that tells you. Statement and PreparedStatement are actually interfaces, so they don't have actual implementations.
 
Scott Selikoff
author
Saloon Keeper
Posts: 4014
18
Eclipse IDE Flex Google Web Toolkit
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
60 million? How often do you perform the insert? I'd recommend putting everything in a SQL file and inserting it directly into the database. That number of records never works that well via JDBC even with batch updates.
 
Susan Smith
Ranch Hand
Posts: 224
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
It's a one time insert for the 60+ million. But in other parts in my project, I have to do insert/ update in millions too (< 10 millions approximately), that's why I'm looking for a really really efficient way to make things run more smoothly.
 
Paul Clapham
Sheriff
Posts: 21107
32
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you're looking for an efficient method, and you have more than one method to choose from, then I would advise setting up some fairly large test data sets and running them through the different methods. Measure whatever counts as "efficiency" (in your case it sounds like it's wall-clock time) and see which method wins.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic