Paul Sturrock wrote:
Goutham Pallipati wrote:
if you are checking whether there exists a record for given condition the I would suggest to use following instead of count.
because count often result in full table scans.
using count
Select count(1) from emp where empno=121;
using exists it becomes
select count(1) from emp where exists (select empno from emp where empno=121)
I don't follow this. If there is no index on empno, both queries run the risk of a full table scan. What difference does exists make?
In fact, is this even logically correct? The first will give you a count of all emp records where empno == 121, the second will increment the count when the subquery evalues to true and not re-evaluate the exists clause which I think will give a different result (unless I'm missing something)?
Paul,
There are even more issues with the suggestion than that...
You are correct that if empno has no index... there will be a full table scan but even more so since there isn't any correlation between the inner and outer query for the exists clause... there will be a full table scan on the outer query regardless of the presence of an index on empno in the example.
When using exists a correlation should exist between the inner and outer queries.
Exists is a Boolean and it evaluates for each row in the outer query... if the Exists clause evaluates to true, the row is returned, false if it evaluates to false. In this case, if an empno of 121 does NOT exist within the emp table... the entire table is scanned in the outer query with no rows returned.
If an empno of 121 does exist within the emp table... all rows of the emp table would be returned.
...and as you correctly pointed out... if there is no index, there is a full table scan for both the inner and outer query.
Even more to the point... the count(*) function is a useful aggregation function that avoids the need to do multiple sub-queries (with Select 1 and Group By and Sum()) to return useful counts of data occurrences. It allows you to do something in a single query that would take you at minimum 2 queries, a group by clause, and sum function to duplicate the same results. It isn't an inefficient SQL tool. It is a useful one when needed.
Paul