I am trying to check soem data within a certain data and that has a value in one of the columns. Trying something like this:
but this is taking a long time to run. Something like 30-45 minutes. Even if I try to change it to look at values within one day:
it is taking a while (right now it is on 5 minutes and counting...)
So am I doing something noobish that is making this query take way longer than it should (something to do with how I specify date range I would guess - although I have tried variations on how I do this)? Any tips to make this faster would be great.
True wisdom is in knowing you know nothing - Socrates
I bet that your second query will take more or less the same time. Your database is probably "fullscanning" (reading all of) the table to get the answer. The second run might be faster though, if parts of the table are already in cache.
How many rows satisfy the WHERE condition of your query? If it is just a fraction (say, up to 5%) of all rows in the table, then adding an index on the creation column (perhaps combined with the attribute1 column) should help.
Details depend on the database you're using. In some databases (eg. Oracle) it would be advisable to gather optimizer statistics after you add the index, so that the optimizer can come up with optimal execution plans.
There is not much to improve in your query. You should avoid "SELECT *" and enumerate only the columns you need in your query - it might save some network bandwidth by not fetching columns you don't need (and its a good thing to do anyway, as the list of selected columns becomes fixed and self-documented), but that's about all.
Creating an index will help to an extent. But looking at the query, even with the index, the query might take a lot of time to bring out the results (the time taken varies depending on the amount of data the DB has to process for your query predicates).
The best way to make this query run faster is provide any other good predicate to the where clause which brings out only the required data .
Matt Kohanek wrote:Thanks for all the tips everyone - turns out not much can be done as you say - I just have hundreds of thousands of records to go through so there is not much to be done
Are you using Oracle? If so, then hundreds of thousands of records is not all that much data, assuming you have a reasonable DB server.
You might also be doing an implicit type conversion on your BETWEEN values i.e. they are character strings but you are apparently comparing them with a date value. Try using TO_DATE on the strings to make sure the values are actually dates before they get compared with the table column.
An index on your CREATION_DATE column would definitely speed this query up, assuming you would normally expect to fetch only a small proportion of the records.
Try running the query with each condition separately and see which one seems to run faster. This should tell you which condition is slowing the original query down.
As others have suggested, don't use "SELECT *" if you can avoid it, just select the columns you need. Try running the query by just fetching one column initially to see if that makes a difference.
Try "SELECT COUNT(*) FROM..." or use "...AND ROWNUM <= 10" to reduce the amount of data you actually pass back from the query. If this makes it return much faster, then the query is finding the data quickly but taking a long time to pass it back.
And run an explain plan to see where the query is spending its time.
How are you executing this query? Try running it directly in your SQL environment e.g. SQL*Plus, to remove any overhead due to ORM issues etc.
If none of that works, talk to your DBA and see if they have any bright ideas.
Though your advice is very valuable, I'd have a small clarification:
chris webster wrote:
Using "IS NULL" or "IS NOT NULL" disables the use of an index on a column, even if you had an index on it.
Definitely not, at least in Oracle. Composite indexes do contain null values, assuming at least one of the columns in the index is non-null. Only records with nulls in all columns in the index are not indexed. In this case a composite index would help, because the date column would not be null.
Moreover, IS NOT NULL queries can be answered using even a simple index, because all the rows matching the condition will be in the index. If the column contains a lot on null values and just a few non-nulls, optimizer will certainly use the index to get these rows.
More NULL handling tricks can be pulled by using function based indexes, but this is beyond the scope of the original question.