Win a copy of Mesos in Action this week in the Cloud/Virtualizaton forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

select distinct

 
natasa jones
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi all!
i have this query:

select a_id, a.description,a.year from tableOne a,tableTwo b where b.id in (3,6,5) and a.id=b.id limit 3 .

what i get is some rows which all are of the same id.from example are the returned products are of the id 3.but what i really want,is to get three results and each of them should have different id.so the first should have id 3,the second id 6 and the last 5.i want to ask how i can take the rows i want,using distinct for one of them...
something like :

select a_id, a.description,a.year distinct a._id from tableOne a,tableTwo b where b.id in (3,6,5) and a.id=b.id limit 3 doesnt work!

any thoughts?
thanks!
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34422
347
Eclipse IDE Java VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Natasa,
Which record do you want to return for each id - one at random? Also, is it ok to assume this is a simplified version of the real query? As written, table b doesn't need to be in the query at all. (I've removed table b in my examples.)

I can think of a few approaches to this:
1) Do three queries with a limit of 1.
2) Do three queries and union them if your database supports an embedded limit.
3) Use a group by getting records that may not go together: (depends on what you are trying to accomplish)
select a_id, max(a.description), max(a.year)
from tableOne a
where a.id in (3,6,5)
group by a.id
limit 3
4) Use a subquery on the primary key of your table: (this one returns an accurate result - an improvement of #3)
select a_id, a.description, a.year
from tableOne a
where a.id in (3,6,5) and a.year = (select max(a2.year) from tableOne a2 where a2.id = a.id)
 
natasa jones
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
well yes it is a simplified versionno i would like to have 3 results from this query,not one,but its of them i would like to have a different id,(but this id should have a value from the list of three possible values each time)..
 
Layne Lund
Ranch Hand
Posts: 3061
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Let's say you have data in this table as follows:


The query you gave (select a_id, a.description,a.year distinct) will return both of these rows since the ordered triple for each of this rows is different (i.e. at least one column has a different value). "distinct" will only eliminate rows when all three columns have the exact same value. If you want just one row with the id of 3, which of these two rows should be returned?

I apologize that I don't have any suggestions how to solve your problem. I just wanted to help clairfy how "distinct" works in a SQL query.

Layne
 
natasa jones
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hmm you are right then!distinct isnt even the correct expression to use!
it doesnt matter which of two! i just want the results to have different ids.after that it could just pick which row with that id it will bring!even better if everytime it picks a different row!!any ideas on how to do that?
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34422
347
Eclipse IDE Java VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Jeanne Boyarsky:
4) Use a subquery on the primary key of your table: (this one returns an accurate result - an improvement of #3)
select a_id, a.description, a.year
from tableOne a
where a.id in (3,6,5) and a.year = (select max(a2.year) from tableOne a2 where a2.id = a.id)


Natasa,
Since it doesn't matter which row is returned, option #4 is the general idea. It always returns the latest result with that id assuming that year is distinct. If it is not, we need to know more about your schema. What is the primary key on this table?
 
natasa jones
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi!
i will describe my tables in more detail!
my first table,tableOne lets say a,has rows a.id(primary key),a.product_id,a.year,....
the second table ,tableTwo,b has rows b.product_id(primary_key),b.description.
i want my application,each time,a registered user loggs in,to suggest to him 3 products,according to his preferences.so list in(3,6,5)where 3,6,7 are ids for preferences is an example,and different perhaps for each user.but i dont know how to make it show one product of each category!so three products of different categories.
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34422
347
Eclipse IDE Java VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Natasa,
The details help alot. I oversimplified the problem by removing table b from my example!

Try this:


It's the same query you had in the first post with an additional and clause. That and clause makes it distinct by only using one (the maximum) product id that meets the id requirements in your in clause.

Note that this will perform slowly when compared to the initial query. If performance is important, you may want to do some tests comparing this approach to three database roundtrips.
 
natasa jones
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
i used it,but it keeps bringing results of same category id...
 
George Stoianov
Ranch Hand
Posts: 94
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hello,

It seems to me that you are trying to achieve something of a random result but within certain constraints. To help you with this I think knowing what the data looks like and what the final output should look like is very beneficial...if you can share those please do so. Your respective rdbms engine is of equal importance.

If you want to randomize records you need to use a random function in the corresponding database and that will best work with a number which is a primary or a foreign key. If you stick a bunch of random functions in the same query and combine input from different tables you can still achieve this but I doubt anyone can help you without actually knowing more about the data how it is collected and entered and seeing the actual records. Databases have functions etc. but getting the stuff out in the right format in some circumstances is quite an art, that is why I like the suggestion that you actually combine these randomized results before hand and then put them together for the user.

To improver performance you may actually pre-make this table for all users if feasible.

My 2 cents.
George
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic