This week's book giveaway is in the OO, Patterns, UML and Refactoring forum. We're giving away four copies of Refactoring for Software Design Smells: Managing Technical Debt and have Girish Suryanarayana, Ganesh Samarthyam & Tushar Sharma on-line! See this thread for details.
I have a query that provided a given parameter returns 0 rows within a second when running from SQl Server GUI tool. However the same query takes about 15 mins!! when I run from within a Java app with a JDBC PreparedStatement. I realize there is some inefficiency because of a CAST in a join clause (and ISDATE in WHERE) but still, it returns in a second from the GUI and I don't have the option right now of changing the schema. If anyone can explain what could possibly be behind this please let me know. Here is the query:
select pd.ssn, pd.lastname, pd.firstname, count(c.customerid) from payrolldetail pd join payrollheader ph ON ph.payrollheaderid = pd.payrollheaderid
JOIN customer c ON c.lastname = pd.lastname and c.firstname= pd.firstname and c.username != pd.ssn
JOIN accountenrollment ae on ae.accountsprovidedid = ph.accountsprovidedid AND ae.customerid = c.customerid
join participantdate pdt on pdt.ssn = c.username and pdt.accountsprovidedid = ae.accountsprovidedid AND
pdt.dateofbirth = CAST(pd.dateofbirth AS datetime)
where pd.payrollheaderid = ? and ISDATE(pd.dateofbirth)=1
GROUP BY pd.ssn, pd.lastname, pd.firstname HAVING count(c.customerid) > 0
Query optimization is a non-trivial topic and there could be a lot of factors affecting the query. My recommendation, as with any issue of queries running slow, is to subtract the query to the bear minimum and slowly add joins/conditionals until you find any clause that causes a serious delay. Then, either optimize that part of the query or add an index to speed it up.
As far as the difference between the two, there are a number of options you can specify on the JDBC connection string to change the behavior of the connection. Also, make sure you are using the same username/password as SQL Server does have permission issues that can be seen by using different users.
As a side note- using functions in conditional clauses can be extremely slow, specifically if it has to apply the function to the entire table. They should be avoided at all costs, possibly using denormalized columns.
Thanks, I found it. There is a setting sendStringParametersAsUnicode for the driver that I've set as "false". This did it. The query was in a process that uploaded, & validated a 4600 line payroll spreadsheet. It took 20 minutes yesterday, now it just took 52 seconds. I realize that using the functions in the conditional clause is not best, but I didn't have the immediate option of altering the schema/datatypes (on my to do list). Thanks
Do what you can to eliminate the function call, perhaps by performing the operation when inserting the data as a second column. While databases can take advantage of indexes to improve perform, function calls throw indexes out the window, rendering them useless as the function needs to be applied to every record in the table. I suspect you could reduce your query to under a few seconds if the function calls were removed and proper indexes were added.
I thought about this issue some more and decided to write up a blog article about it. Despite the problem being a driver issue, it still makes sense to eliminate function calls as much as possible in high profile queries.