aspose file tools*
The moose likes JDBC and the fly likes How to delete duplicate rows Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Databases » JDBC
Bookmark "How to delete duplicate rows" Watch "How to delete duplicate rows" New topic
Author

How to delete duplicate rows

Neha Dhaka
Greenhorn

Joined: Aug 30, 2005
Posts: 29
I want to write a query to delete duplicate rows from a table ,suppose there is emp table it has two fields emp id and emp name ,there is no primary key .
and there are some duplicate rows,I want to delete all duplicate rows based on the empid .
Can anybody suggest?
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336


there is no primary key

SQL is intended to work against valid relational data. A relation with no PK is not valid, since there is no way to identify it (I wish databases were implemented not to allow table creation without a PK). I'd recreate the table with a primary key and select the data into it. Use "distinct" to get the distinct empid (which can become your PK).


JavaRanch FAQ HowToAskQuestionsOnJavaRanch
Neha Dhaka
Greenhorn

Joined: Aug 30, 2005
Posts: 29
suppose we consider employee id as primary then how it can be done in sql?

Raja
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

If you implement the table with empid as a primary key there will be no duplicates.

There is no easy way to use SQL to delete duplicates on a table, because (as I said) SQL is intended to be used on relational data and an entity with no primary key will not contain relational data.

You could write a procedure which uses a cursor to go through the table, adds the employee row to a temp. table (after checking that it is not there already). Then you could delete your temp. table, properly recreate it with a primary key and insert the data from your temp. table into it. You follow?
Neha Dhaka
Greenhorn

Joined: Aug 30, 2005
Posts: 29
well,I mean to say if there is employee id is primary ,but names can be duplicated then what should be approach for that.

Raja
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

Duplicate names? If your PK is just the empid field then your design is allowing dupicate emp names. If you don't want that to happen, define both fields as part of a composite key, or add a unique constraint to the emp names field.

If you want to tolerate duplicate emp names, but have some piece of functionality which requires a list of all the emp names, ignoring duplicates (i.e. a set of emp names) use the distinct keyword in your select statement.
Sunil Dumpala
Greenhorn

Joined: Jul 28, 2005
Posts: 29
Try the following. But I am not sure if this will work for all databases.

DELETE FROM emp
WHERE rowid NOT IN
(SELECT max(rowid) FROM emp
GROUP BY emp id
HAVING count(*)>=1)
 
With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
 
subject: How to delete duplicate rows