select x.a, y.b
from table1 x
join table2 y on x.id = y.id
where y.type in ('xxx', 'xxxx', 'xxxxx')
and x.c = 'something'

Table1 (alias x) is very big, about 30 million rows, and y.id is the key of table2.

Executing this query takes only a few seconds, but iterating through the result set takes me 10 minutes, which is too long. There are 800,000 records in the result set.
If I instead split the IN list into a UNION:

select x.a, y.b from table1 x join table2 y on x.id = y.id where y.type in ('xxx') and x.c = 'something'
UNION
select x.a, y.b from table1 x join table2 y on x.id = y.id where y.type in ('xxxx') and x.c = 'something'
UNION
select x.a, y.b from table1 x join table2 y on x.id = y.id where y.type in ('xxxxx') and x.c = 'something'

then the execution takes me 5 minutes, but iterating through the result set takes just 1 second.
Why does this happen?
And is there any way to minimize both the statement execution time and the result-set iteration time?
What you've stumbled on is a very common issue in performance tuning, and it can have nothing to do with JDBC. In particular, depending on the indexes on your tables (and the memory available), the query may be answered from a compact index on the data rather than from the data itself. For example, if you query age > 15 against a user table and there is a sorted index on age, finding the list of records that match can be extremely fast, but actually reading those records can be extremely costly, since the full rows are not stored in the index or in memory.
One tip is to include the columns you want to read (the projection) in an index as well; that way both the query and the retrieval can be satisfied from the index without ever reading the table data itself. Another option, in the JDBC context, is to increase the statement's fetch size so that rows are transferred in larger batches, although there's no guarantee this will help.
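The fetch-size tip can be sketched in JDBC roughly as follows. The fetch size of 500 and the column accesses are assumptions for illustration, not from the thread; the query string is the one from the original post:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class FetchSizeDemo {

    // Number of driver round trips needed to pull `rows` rows when the
    // driver fetches `fetchSize` rows per round trip (ceiling division).
    static int roundTrips(int rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Hypothetical sketch: run the poster's query with a larger fetch size
    // so the driver moves rows in batches instead of a few at a time.
    static void readWithLargerFetch(Connection conn) throws SQLException {
        String sql = "select x.a, y.b from table1 x join table2 y on x.id = y.id "
                   + "where y.type in ('xxx', 'xxxx', 'xxxxx') and x.c = 'something'";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(500);          // hint: fetch 500 rows per round trip
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rs.getString("a");     // process each row here
                    rs.getString("b");
                }
            }
        }
    }

    public static void main(String[] args) {
        // With 800,000 rows, Oracle's default fetch size of 10 means
        // 80,000 round trips; a fetch size of 500 cuts that to 1,600.
        System.out.println(roundTrips(800_000, 10));
        System.out.println(roundTrips(800_000, 500));
    }
}
```

Note that setFetchSize is only a hint to the driver, which is why there is no guarantee it helps; the right value depends on row width and available memory.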
As I've said, this is a database-management issue: if you need the entire record (so a small projection of columns doesn't help) and the data is big (say, over 2 GB), it can't fit into memory and may take a long time to read no matter how it is stored. In Oracle you can define tablespaces and store a table across multiple files based on its attributes. Nine times out of ten, appropriate database indexes will solve this, but there are other cases where it's much harder.

[ December 07, 2006: Message edited by: Scott Selikoff ]
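For the common case where indexes do solve it, a covering index that contains every column the query filters on, joins on, or selects lets the database answer from the index alone, without touching the table rows. The index names and column orders below are guesses based on the query in the original post, not something from the thread; whether they actually help depends on the real schema and optimizer:

```java
public class CoveringIndexSketch {

    // Hypothetical covering indexes for the poster's query:
    // table1 is filtered on c, joined on id, and a is selected;
    // table2 is joined on id, filtered on type, and b is selected.
    static final String INDEX_ON_TABLE1 =
        "create index table1_cov on table1 (c, id, a)";
    static final String INDEX_ON_TABLE2 =
        "create index table2_cov on table2 (id, type, b)";

    public static void main(String[] args) {
        System.out.println(INDEX_ON_TABLE1);
        System.out.println(INDEX_ON_TABLE2);
    }
}
```

With both indexes in place, an index-only access path would avoid exactly the expensive per-row reads described above; checking the execution plan is the way to confirm the optimizer actually uses them.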