• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

How to insert to MySql efficiently?

 
Ranch Hand
Posts: 224
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I'm trying to insert data into an empty table. At the end, it will have 60 million+ records.

This is the pseudocode of what I do currently:


It's taking 3.5 - 4 hours. I'm trying to cut down the time as much as possible.

Then I read this article: http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html
It says that:

If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time.



I'm thinking it must be something like this:


What I'm also wondering is whether addBatch mechanism that I do is already like what is mentioned in the article? Can somebody please advise?

Thanks in advance for all your help.
 
Marshal
Posts: 79239
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I tried that sort of thing, but more complicated, more as a test of performance than anything else, and I got nearly 60,000,000 records in. Took about a week . . .

So I don't think you need worry about 3� hours.

You can't tell from the API how that method is implemented; you could try getting whichever class implements Statement and see whether that tells you. Statement and PreparedStatement are actually interfaces, so they don't have actual implementations.
 
author
Posts: 4335
39
jQuery Eclipse IDE Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
60 million? How often do you perform the insert? I'd recommend putting everything in a SQL file and inserting it directly into the database. That number of records never works that well via JDBC even with batch updates.
 
Susan Smith
Ranch Hand
Posts: 224
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
It's a one time insert for the 60+ million. But in other parts in my project, I have to do insert/ update in millions too (< 10 millions approximately), that's why I'm looking for a really really efficient way to make things run more smoothly.
 
Marshal
Posts: 28226
95
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
If you're looking for an efficient method, and you have more than one method to choose from, then I would advise setting up some fairly large test data sets and running them through the different methods. Measure whatever counts as "efficiency" (in your case it sounds like it's wall-clock time) and see which method wins.
 
Consider Paul's rocket mass heater.
reply
    Bookmark Topic Watch Topic
  • New Topic