JavaRanch » Java Forums » Databases » JDBC

Size of a Database transaction - Number of records

ap sharma
Greenhorn

Joined: Aug 11, 2010
Posts: 6
Hi,

I have to insert approximately 96,000 rows into a table based on certain calculations. The business requirement is that this operation happen in a single transaction, i.e. either all the entries are made successfully or none at all. I am using MySQL, and the operation takes much longer than expected: I shut it down after 4 hours, and by that time only 10,000 rows had been inserted. I'm looking for a better solution.

Can anyone throw some light on better ways of handling this?

Thanks in advance,
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

96,000 rows in one transaction is going to require a lot of work on the part of the database to track what it needs to undo. It is fairly unusual to use transactions with a batch of any notable size. You can probably beef up your database and the machine it's on to speed this up, but ultimately this is likely to be slow. How about handling the transaction programmatically? You could insert these rows without using a single transaction and then, if they are not all successful, delete them manually. You could track the partial state of the data with some sort of status attribute (i.e. have an "unverified" status, or whatever, and change other business logic to ignore unverified data).
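The status-attribute approach could look something like this (table and column names here are illustrative, not from the original posts):

```sql
-- Insert each row flagged as unverified; no single huge transaction needed.
INSERT INTO orders (customer_id, amount, status)
VALUES (42, 19.99, 'unverified');

-- Once all rows have loaded successfully, promote the whole batch.
UPDATE orders SET status = 'verified' WHERE status = 'unverified';

-- If the load fails partway through, clean up manually instead of rolling back.
DELETE FROM orders WHERE status = 'unverified';
```

Other queries in the application would then filter on status = 'verified' so that a partially loaded batch is never visible to business logic.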


JavaRanch FAQ HowToAskQuestionsOnJavaRanch
Scott Selikoff
Saloon Keeper

Joined: Oct 23, 2005
Posts: 3704
    

Also, you should disable indexes during the insertion. Active index building will slow down that kind of volume loading.


My Blog: Down Home Country Coding with Scott Selikoff
ap sharma
Greenhorn

Joined: Aug 11, 2010
Posts: 6
@Paul: What do you mean by "It is fairly unusual to use transactions with a batch of any notable size"? Let me elaborate on the 96K rows that I'm talking about: each row is made up of 4 columns (2 VARCHAR and 2 numeric), and the table has an INT primary key for each row.
steve souza
Ranch Hand

Joined: Jun 26, 2002
Posts: 861
96,000 records really isn't that much. I don't think it should take anywhere close to 4 hours. I haven't done them in a transaction, but I have inserted about 1 million records on my local mysql database and it took seconds, not minutes, let alone hours. Just try to insert those same records via a mysql gui, and not through your code and see how long it takes.


http://www.jamonapi.com/ - a fast, free open source performance tuning api.
JavaRanch Performance FAQ
Scott Selikoff
Saloon Keeper

Joined: Oct 23, 2005
Posts: 3704
    

1 million records in seconds? Really? As part of a single, rollback-enabled transaction?
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

steve souza wrote:96,000 records really isn't that much.


Yeah, in the big scheme of things it's not a massive amount of data, but it is not trivial if you do it in one transaction (though I agree, 4 hours seems like a very long time even for a transaction of this size).
Matthew Brown
Bartender

Joined: Apr 06, 2010
Posts: 4392
    

I'd be tempted to load the records into a temporary table (no transactions needed - just make sure the number of records is right at the end). Then you could copy them over to where you want them with a single INSERT INTO...SELECT statement. I'm not sure what the performance difference would be, but it wouldn't surprise me if it was a lot better.
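A sketch of that staging approach, assuming a target table named target_table (the names are placeholders):

```sql
-- Stage the rows in a temporary table with the same structure as the target.
CREATE TEMPORARY TABLE staging LIKE target_table;

-- ... bulk-insert the 96,000 rows into staging here, no transaction needed ...

-- After verifying SELECT COUNT(*) FROM staging matches the expected total,
-- copy everything across in a single statement.
INSERT INTO target_table SELECT * FROM staging;
```

The final INSERT ... SELECT is one statement, so it is atomic on its own, which gives you the all-or-nothing behaviour without holding a transaction open for the whole load.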
Bill Karwin
author
Ranch Hand

Joined: Aug 02, 2010
Posts: 32
Here are some tips for speeding up bulk loading in MySQL:

You should use LOAD DATA INFILE to bulk-load a large quantity of records quickly. This usually runs 10 or 20 times faster than using INSERT.
See http://dev.mysql.com/doc/refman/5.1/en/load-data.html
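A minimal example of the syntax, assuming a comma-separated file and an illustrative table and column list:

```sql
-- Load a CSV file straight into the table; the path, table, and column
-- names here are assumptions for illustration.
LOAD DATA INFILE '/tmp/rows.csv'
INTO TABLE target_table
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(col_a, col_b, col_c, col_d);
```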

If you do use INSERTs, don't use autocommit. The overhead of starting and committing a transaction for each INSERT is really expensive. Learn to use explicit BEGIN TRANSACTION and COMMIT statements. I usually commit after inserting every 1000 rows, just in case resources associated with uncommitted data grow.
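In MySQL that pattern looks roughly like this (table and column names are illustrative):

```sql
-- Turn off autocommit so each INSERT is not its own transaction.
SET autocommit = 0;

START TRANSACTION;
INSERT INTO target_table (col_a, col_b) VALUES ('a', 1);
INSERT INTO target_table (col_a, col_b) VALUES ('b', 2);
-- ... roughly 1000 rows per batch ...
COMMIT;

-- Then START TRANSACTION again for the next batch of ~1000 rows.
```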

You can also reduce overhead by using multi-row INSERT syntax:
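For example (illustrative names):

```sql
-- One statement inserts several rows, amortizing parsing and
-- client/server round-trips across the whole batch.
INSERT INTO target_table (col_a, col_b)
VALUES ('a', 1),
       ('b', 2),
       ('c', 3);
```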

Disable foreign key checks and unique checks (remember to re-enable afterwards!):
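In MySQL these are session variables:

```sql
SET foreign_key_checks = 0;
SET unique_checks = 0;

-- ... bulk load here ...

-- Re-enable afterwards, or constraint violations will go undetected.
SET unique_checks = 1;
SET foreign_key_checks = 1;
```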

Disable keys (i.e. indexes) during a restore. This is how you do it:
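The statements (note that on MyISAM tables DISABLE KEYS affects only non-unique indexes):

```sql
ALTER TABLE target_table DISABLE KEYS;

-- ... bulk load here ...

-- ENABLE KEYS rebuilds the disabled indexes in a single pass, which is
-- much faster than maintaining them row by row during the load.
ALTER TABLE target_table ENABLE KEYS;
```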

If you use MyISAM tables, make sure your key_buffer_size is set as large as possible. The default is 8MB, and the max is 4GB. I'd try 1GB.

If you use InnoDB tables, make sure your innodb_buffer_pool_size is set as large as possible. The default is 8MB, and the max is 4GB. I'd try 1GB.

Set innodb_flush_log_at_trx_commit = 2 during the data-load if you use InnoDB tables (but understand that this setting is global and may reduce slightly the ability of the database to guarantee writes).




Bill Karwin is the author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming
Gian Franco
blacksmith
Ranch Hand

Joined: Dec 16, 2003
Posts: 977
ap sharma wrote:I have to insert approximately 96,000 rows in a table based on certain calculations.


...do the calculations have any impact on the low performance?


"Eppur si muove!"
steve souza
Ranch Hand

Joined: Jun 26, 2002
Posts: 861
I just tested loading about 20,000 records into a mysql table that has 3 columns of ints. I was surprised at how shockingly slow it was! I had previously done a 'load' of about a million records, which was quite fast. The insert command, however, took minutes to load a couple thousand rows before I gave up and killed it, whether I used a transaction or not. That seems awfully slow for any database. In the past (on Sybase ASE) I have even seen large transactions perform better than individual insert statements, because the server doesn't have to send status back to the client after each insert completes. I am not a mysql expert, so I really don't know what to say. Just wanted to correct my previous statement.

For anyone who wants to try, here is the table I used. I inserted the values 1, 2, 3 in each column.
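The original table definition did not survive in this archive; based on the description above (three int columns, no indexes mentioned), it was presumably something like:

```sql
-- Hypothetical reconstruction; the table name is a guess.
CREATE TABLE insert_test (
    col1 INT,
    col2 INT,
    col3 INT
);
```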

 
 