Order of SQL WHERE clause conditions & database table with index
Joined: Feb 05, 2004
There is a rule of thumb for database multicolumn index that says "put the most selective column first in the index definition". Usually it will give better performance. But, did the ordering of WHERE clause conditions have performance impact to the query on a table with multicolumn index? Consider this scenario:
Table BID has columns USERID, ITEMID, DATE. Assuming USERID has highest selectivity and DATE has lowest selectivity. The following multicolumn index was created IDX_BID(USERID, ITEMID, DATE).
Does the following two SQL statements have a difference in their performance? 1. SELECT * FROM BID WHERE USERID=? AND ITEMID=? AND DATE=? 2. SELECT * FROM BID WHERE DATE=? AND ITEMID=? AND USERID=?
Joined: Feb 15, 2006
In theory, exactly what plans the optimizer is going to come up with should be determined by the indexes available and their stats (as well as some heuristics only the optimizer knows about). In your particlular case, since the multicolumn index contains all the data needed to resolve the conditions specified in both queries, that index will make a very good candidate to be chosen as the primary place to look at, regardless of the order in which the conditions appear in the statement.
But in practice, I have heard so many horror stories of optimizers sometimes doing their work in absolutely weird, illogical ways that I would not be sure of anything unless I check out the execution plan(s).
<a href="http://webjavenue.com/" target="_blank" rel="nofollow">Your first website in Java: easier than you think</a>
Joined: Dec 15, 2005
It will depend on the optimization techniques that the database uses. I can only speak specifically about Oracle database.
With Oracle, the are 2 different optimization engines, the Cost-Based Optimizer (CBO) and the Rule-Based Optimzer (RBO). The RBO is the older and has been deprecated for maybe 6 or 8 years, it hasn't improved since Oracle 7 or 8; I think its no longer available in Oracle 10g, and if it is available, you still don't want to use it. The CBO is newer, better, and has all the recent improvements; I believe it's the default now, and tough to avoid using if you wanted to...
The CBO doesn't care at all the order of any element in the SQL statement; a large number of "likely" execution plans are evaluated, each sub-part is given a cost based on meta information about the data that the database maintains, and the execution plan with the lowest estimated cost is chosen to execute.
With the old RBO, a series of heuristic rules are evaluated in order; in some of the rules, order in the FROM clause is a documented "tie breaker" in determining the order in which the tables are joined. Some people have also claimed that order in the WHERE clause is a similar tie-breaker under certain circumstances, but I'm not aware of any documentation to support that.
However, for the simple query you're asking about, even the RBO will certainly pick the same execution plan no matter what order is put in the WHERE clause. You would probably need to be joining more than 5 tables (or have a similar number of sub-queries) before it might matter. Most modern databases are going to be as smart as Oracle is, at that level of complexity.
Of course, the easiest way to settle the question for your database is to set up a testcase and time it.