I need to generate a huge table from reading a another huge input table. Here is the requirement:
1. Read a row from the input table, do calculation, generate a target row.
2. The generated target row may or may not be generated before i.e. the same PK may already be generated by a previous input row.
3. If the PK already generated, I would do an 'update'. If it isnt, I would do an 'insert'.
In order to verify if the PK've already been generated, my approach is to issue a 'select pk from target_table' for each generated target row to see if the PK is there or not.
The problem is, for each input row, the 'select' verification potentially means scanning the table once. The would means a huge performance problem.
I am just wondering is there better way to check if the PK already generated in step 3 above? I am thinking about caching the whole generated target table in memory so that the duplication checking is done against memory image, but is there any built-in support provided by JDBC rather than manually maintaining the cache.
I know nothing about Rowset but heard of Cached RowSet. Is the a cache of db row that I can use for my requirement here?
Alec, If you are using Oracle database version 9i (or above), then I would use the MERGE statement.
If you're not using Oracle, then the database you are using may have something similar to Oracle's MERGE statement.
If your database isn't that clever, then I would try the update first. The "executeUpdate()" method returns the number of rows that were updated. If it returns 0 (zero), you know there is no such row in the table and so you would then do an insert.
From your description, it seems (to me) that in (your) step 1 (one), you are first reading a row, then doing a calculation. You may be able to incorporate the calculation into your query. That way, you read an already calculated row, which may save you some time -- but you would need to time both approaches to verify this.
Good Luck, Avi.
Joined: Jan 28, 2004
But I dont understanding how Oracle's MERGE can do the trick here. Does that mean i need to create a temporary table containing the rows generated and do a MERGE regularly.
My program will generate 1 *ROW* at a time. But the MERGE statement expects a *TABLE* containing all the rows to be merged into the target table.
Could you provide some specific code to illustrate this. Thx for your help.
I need to generate a huge table from reading another huge input table
To me, this means you have defined two database tables -- one contains data and the other one does not. You want to copy data from the table containing data to the empty table. This is what MERGE is intended to do. So I guess you are saying that my interpretation of your question is wrong, right?
From your reply, I also assume that you are, indeed, using an Oracle database? Is this a correct assumption?
Example code for the MERGE statement is available in the Oracle documentation. And the Oracle documentation is available from: