# Combinatorial queries.

Claude Moore
Ranch Hand
Posts: 800
7
Hullo everybody,

Scenario: let's suppose I've to query a table with a query like

SELECT F1, F2, F3 FROM MYTABLE WHERE F1 = ? AND F2 = ? AND F3 = ?.

I need to execute this query changing parameters' actual values in a combinatorial way,
until a combination gives back at least one row or all combinations are unsuccesfully tried.
For istance, I may have this sequence of values:

(V1,V2,V3);
(V1,V2,"");
(V1,"",V3);
(V1,"","");

where V1, V2, V3 may by string values and V1 is always present, not null, and not blank in each
combination.

A first strategy may be to prepare the statement, clear the parameters each time I execute the
query, until stop condition is met.

I wonder if may be more efficient transform the query into

SELECT F1, F2, F3 FROM MYTABLE WHERE F1 = V1

and cycling over the cursor and, for each cursor row, verify if the returned tuple
(F1,F2,F3) matches the combination (V1x,V2x,V3x). When at least 1 rows matches,
or all combination are done, I'll exit iteration.

What do you think about ?

Thanks.

Jayesh A Lalwani
Rancher
Posts: 2756
32
Depends on how selective V2 and V3 is. If you are getting a million rows from the DB, and then filtering them down to 10, then it would be very inefficient. As must as possible, you should let the DB do the filtering

Why can't you just use OR clauses?

SELECT F1, F2, F3 FROM MYTABLE WHERE F1 = ? AND (F2 = ? OR f2="") AND (F3 = ? or F3="")

IMO, for most databases this will be most efficient. Internally, most databases will figure out which of those clauses are most selective, and use that to read the rows of the disk. Then, it will do exactly what you are planning to do yourself. The difference is that you are letting the database decide how to minimize the IO

Edit:

Since you just want the first row that matches, you can put an order by clause and tell the DB to get you only 1 row. Most DBs have clauses that allow you to limit rows. This would probably make it a lot more efficient if you have indexes on the columns. It;s hard to say without knowing the DB, and the makeup of your data. You might want to use the QUery plan that the DB generates for you to see what exactly the DB is doing.

Claude Moore
Ranch Hand
Posts: 800
7
Hi Jayesh,

thanks for you reply, first of all.

I need to be more specific to illustrate my working scenario. What I'm trying to obtain is a mechanism to query attributes of particular
entities hierarchically organized, where F1..Fn are the composite keys of those objects ... an example may simplify our reasoning.

Let's suppose that a Product may be specialized by "customerID" attribute, and a table X stores attributes for my product entities.
I need when processing an order to read these attributes: if i found a full key (productID, customerID) I can use corrispondent records,
otherwise I need to query using only (productID, "") key, as a "fall back" option.

In my scenario, whene such "hierarchical query" is adopted, I've never millions of records; the order of magnitude is about an hundred
of records (in worst scenario). You're perfectly right about the fact that, generally speaking, read 1 million rows to shrink down later
the result set to a few records is a very bad idea.

Jayesh A Lalwani
Rancher
Posts: 2756
32
Well, from what I see, you have 2 options
a) Run 6 queries, one at a time until you get an answer
b) Run 1 query that queries for all the data, sorts it according to your "fall back" rules, and then just returns the first row

Both will work. However, from a performance POV, my gut feel says the second one would be better in most cases. However, don't trust my gut. The database gives you tools that let you measure the impact of each query. Most DBs have some way of getting the query plan, that also give you the cost of running the query. It will be better if you run the explain plan on real data. Try both options out and look at the best case/typical case/worst case cost of both options

Claude Moore
Ranch Hand
Posts: 800
7
How would you order by result by search combination criterion ? I cannot figure out how achieve this using an order by clause...

Jayesh A Lalwani
Rancher
Posts: 2756
32
• 1
Well, since empty string always evaluates to be less than non-empty string, you can sort descending

SELECT F1, F2, F3 FROM MYTABLE WHERE F1 = ? AND (F2 = ? OR f2="") AND (F3 = ? or F3="")
order by f2 desc, f3 desc

If you have a fall back logic that is not satisfied with the sql above, then you might have to look at your database to see how you can do custom sorting. FOr example, Oracle can do

SELECT F1, F2, F3
FROM MYTABLE
WHERE F1 = ? AND (F2 = ? OR f2="") AND (F3 = ? or F3="")
order by
case when f2<> "" and f3 <> "" then 1
case when f2<> "" and f3 == "" then 2
case when f2== "" and f3 <> "" then 3
else 4

Claude Moore
Ranch Hand
Posts: 800
7
• 1
Thank you so much... this forum's really a place where one can learn a lot.
Thanks again.

Martin Vajsar
Sheriff
Posts: 3752
62
I think Jayesh's advice deserves a cow

Just one small nitpick regarding Oracle: an empty string in Oracle is actually a NULL. Therefore the expression '' = '' will evaluate to false. To test that an expression is an empty string, one must use expression is null.

Sorry for being such a pedant, but this concept in Oracle can cause headaches to someone who's not used to it.

Jayesh A Lalwani
Rancher
Posts: 2756
32
Thanks, Martin.

Yes, agreed on the nitpick. I'm getting too old to keep the pecularities of the databases straight

Claude Moore
Ranch Hand
Posts: 800
7
Martin Vajsar wrote:I think Jayesh's advice deserves a cow .

Totally agree. I hope Javesh will offer me a cup of milk of his new cow

Jayesh A Lalwani
Rancher
Posts: 2756
32
All my cows are bulls. You want to milk them. You help yourself :p