aspose file tools*
The moose likes Performance and the fly likes Query Optimization Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "Query Optimization" Watch "Query Optimization" New topic
Forums: JDBC Performance
Author

Query Optimization

Vaibhav G Garg
Ranch Hand

Joined: Sep 23, 2011
Posts: 140
Hi,

I have two tables Employee and Department with the following columns:

Employee => EmpId, EmpName, EmpSalary, DeptId
Department => DeptId, DeptName

I want to find out the deptid and average salary of the employees in the department. The employee table has 1 million rows or more. So, need best approach to get the desired result.

Below is the query given:

select deptId, avg(EmpSalary) from Employee group by DeptId;

Can we update this query or use some other approach to handle 1 million rows.

Thanks,
Vaibhav Garg
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30309
    
150

Vaibhav,
The first question is whether the query is really too slow. It might not be. While a million rows sounds like a lot, computers are really fast.

Unless you are oversimplifying the problem/table, there isn't going to be anything you can do to the query to make it faster. It is already a very simple query and there aren't a lot of extra fields in the table.

Which brings us back to whether the query is really too slow, how often you run it and how fast you need it to be. Options include caching the results in Java, running it for only one dept id (if you aren't using them all) or creating an aggregation table that stores the average that you update every time any employee has a salary change. Which increases the cost of those updates.


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
pramod mhatre
Greenhorn

Joined: Dec 27, 2012
Posts: 11
Dear Vaibhav,

Is it really slow? If yes then please check whether index created on column from where clause. If its not then create it. It will reduce processing time more than 50%. Moreover this is simple query and if you still want to optimize it then check whether you can use temp where you can get salary and department id in temp and then use avg function on it.

Hope it will work ...
Ivan Jozsef Balazs
Rancher

Joined: May 22, 2012
Posts: 867
    
    5
pramod mhatre wrote:whether index created on column from where clause.


The cited very simple (straightforward) select statement does not include a "where" clause.

It is however an interesting question whether the presence or absence of an index on the column deptId has an effect on the execution speed of the query.
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

Ivan Jozsef Balazs wrote:It is however an interesting question whether the presence or absence of an index on the column deptId has an effect on the execution speed of the query.

It is, of course, database dependent, but I'd say it doesn't. Generally, the database has to visit all rows to answer the query, and therefore it won't use an index.

What might help in some databases (eg. Oracle) would be a composite index on the deptId and EmpSalary columns. If the table contained significantly more columns than these two, the index might take less space than the whole table, and since all the data needed to answer the query are contained in the index, the DB might read the whole index instead of the table and save some IO operations, thus possibly speeding things up.

Also, this statement is a bit problematic:
pramod mhatre wrote:If yes then please check whether index created on column from where clause. If its not then create it. It will reduce processing time more than 50%.

(Emphasis added)

Even in simple queries with just a WHERE clause, how much an index will help depends on a lot of things, such as selectivity and the total number of rows in the table. I believe that in most databases it is possible to create a scenario in which adding an index makes even such a simple query perform worse.

Indexes are not magical devices that will instantly speed everything up by X percent. Without understanding at least the very basics of how indexes and databases work (and knowing the structure and quantity of the data), it is not possible to say with any certainty whether an index will help or not. I know this very well - at the beginning of my career, I made exactly this mistake, throwing indexes randomly around and being surprised they didn't help.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7652
    
  19

Vaibhav G Garg wrote:The employee table has 1 million rows or more...

Really? That would seem to narrow it down to only two possibilities: Walmart, or the National Grid Corporation of China. Which one do you work for?

Winston


Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Query Optimization