I want to write a query to delete duplicate rows from a table ,suppose there is emp table it has two fields emp id and emp name ,there is no primary key . and there are some duplicate rows,I want to delete all duplicate rows based on the empid . Can anybody suggest?
SQL is intended to work against valid relational data. A relation with no PK is not valid, since there is no way to identify it (I wish databases were implemented not to allow table creation without a PK). I'd recreate the table with a primary key and select the data into it. Use "distinct" to get the distinct empid (which can become your PK).
If you implement the table with empid as a primary key there will be no duplicates.
There is no easy way to use SQL to delete duplicates on a table, because (as I said) SQL is intended to be used on relational data and an entity with no primary key will not contain relational data.
You could write a procedure which uses a cursor to go through the table, adds the employee row to a temp. table (after checking that it is not there already). Then you could delete your temp. table, properly recreate it with a primary key and insert the data from your temp. table into it. You follow?
Joined: Aug 30, 2005
well,I mean to say if there is employee id is primary ,but names can be duplicated then what should be approach for that.
Duplicate names? If your PK is just the empid field then your design is allowing dupicate emp names. If you don't want that to happen, define both fields as part of a composite key, or add a unique constraint to the emp names field.
If you want to tolerate duplicate emp names, but have some piece of functionality which requires a list of all the emp names, ignoring duplicates (i.e. a set of emp names) use the distinct keyword in your select statement.