wood burning stoves 2.0*
The moose likes Performance and the fly likes Migration in same DB with best performance Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "Migration in same DB with best performance" Watch "Migration in same DB with best performance" New topic
Author

Migration in same DB with best performance

Kaustubh G Sharma
Ranch Hand

Joined: May 13, 2010
Posts: 1270

I need to create a app, where I have to transfer a million record from one table to another on a click of a button, in a same Data Base.What will be the best approach, for getting best performance?


No Kaustubh No Fun, Know Kaustubh Know Fun..
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 11161
    
  16

umm...My first thought (and I am by NO means an expert) is WHY?

On the surface, this sounds like a bad idea. What purpose does copying the records from one table to another in the same database serve?


There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors
Kaustubh G Sharma
Ranch Hand

Joined: May 13, 2010
Posts: 1270

fred rosenberger wrote:umm...My first thought (and I am by NO means an expert) is WHY?

On the surface, this sounds like a bad idea. What purpose does copying the records from one table to another in the same database serve?


it's not a task fred, this is a curious query of an interviewer, I don't have any idea, what he's is expecting from it. May be wanted to test, how good I am in seeking solution.
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3606
    
  60

Oh, another interview question? In this case, I believe a reasonably good answer would be:
The correct approach is to run one insert into ... select statement. The wrong approach is anything else, particularly reading the source table using ResultSet and inserting individual rows back into the target table using PreparedStatement (or even just plain Statement), as in this case all the data travel needlessly back and forth between the database and the application.


You might also add that this operation is going to take some time (during which the application will be unresponsive), consume significant database resources, and might be really impossible to scale.
Lukas Eder
Ranch Hand

Joined: Jul 22, 2013
Posts: 41
    
    4

fred rosenberger wrote:umm...My first thought (and I am by NO means an expert) is WHY?


There are thousands of reasons. A very simple example would be copying records from TRANSACTIONS to TRANSACTIONS_ARCHIVE tables (if you don't use partitioning). Strictly speaking, you'd probably move the records, not copy them. But for the sake of this question, that's the same.

Martin Vajsar wrote:The correct approach is to run one insert into ... select statement. The wrong approach is anything else


I don't entirely agree with this. For millions of rows, using INSERT .. SELECT can prove to be a performance nightmare in databases with lots of concurrency. The reason for this is the fact that large parts of the source and target tables need to be copied into UNDO / REDO log space in order to ensure transactional integrity across this single statement. This can pretty much block the database engine in some circumstances.

A "good" way to tackle this problem would be to use PL/SQL (or T-SQL or whatever) and loop over smaller data subsets, inserting and committing them immediately, keeping UNDO space small.

Another "good" way could be to copy the records into a temporary table without indexes or constraints, releasing the source table read locks early, and moving the temporary table parts to the target table asynchronously in a separate transaction.

Martin Vajsar wrote:particularly reading the source table using ResultSet and inserting individual rows back into the target table


That, of course, is correct.


When Java and SQL work together, great software can evolve. That's why I have created jOOQ. Follow me on blog.jooq.org
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3606
    
  60

Lukas Eder wrote:
Martin Vajsar wrote:The correct approach is to run one insert into ... select statement. The wrong approach is anything else

I don't entirely agree with this. For millions of rows, using INSERT .. SELECT can prove to be a performance nightmare in databases with lots of concurrency.

Well, there are databases where the single SQL statement is certainly the fastest possible way (assuming you size the UNDO for the job). Unfortunately I often forget there are also other databases beyond the one I use - thanks for pointing that out.

If this was an interview question, there was probably not enough detail to provide one correct answer. Perhaps just "do it in the database" is what was expected.
Lukas Eder
Ranch Hand

Joined: Jul 22, 2013
Posts: 41
    
    4

Martin Vajsar wrote:If this was an interview question, there was probably not enough detail to provide one correct answer. Perhaps just "do it in the database" is what was expected.


True.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Migration in same DB with best performance
 
Similar Threads
JVM size
PDF performance
Excel Browser Synchronization code
Performance of JVM of J2SE1.4, Java5, Java6
MySQL - Escape HTML Characters