Problem solve Get help with specific problems with your technologies, process and projects.

Removing duplicate tuples

How to remove duplicate tuples using the DELETE command in SQL*Plus.

Consider the following table:

employee(empno, ename, job);

Here we assume that empno is neither a primary key nor a unique key. But it is also found that the table contains duplicate tuples in reference to the empno, which is undesirable. Now the task is to remove such duplicate tuples from the relation uing the DELETE command in Oracle SQL*PLUS.

This can be done as follows:

delete from employee a
where rowid > (select min(rowid)
  from employee b
  where a.empno=b.empno);

Reader Feedback

Dave Y. writes: The SQL statement to delete duplicate tuples is seriously flawed. Notice that it is deleting all the tuples in the table that have a rowid greater than the first duplicate tuple encountered. This means that you are likely deleting a lot of good data. The appropriate syntax for this SQL would be:

delete from  employee a
      where  rowid =  ( select  rowid 
                          from  employee b
                         where  a.empno = b.empno
                           and  a.rowid < b.rowid  )
This SQL will leave one of the duplicate tuples intact and delete the others. If you do not wish to leave one of the duplicate tuples intact but instead prefer to delete both the original and the duplicate tuple (at which point it would behoove you to identify a source to obtain the correct version of the tuple and insert it again to avoid losing the data) you simply change the very last predicate in the above example to read

    and a.rowid <> b.row_id    

at which point the command will delete every record in the table where another record exists which has the same empno.

Anagnostou V. writes: Very useful tip for a very common situation in data warehousing. Keep up the good work!

The author responds: I think Dave above has missed something while looking at my query:

delete from employee a
where rowid > (select min(rowid)
  from employee b
  where a.empno=b.empno);

Here you can see that the outer query will delete those tuples that have a rowid greater than the rowids of the tuple having duplicate values of empno only. This will be so for all those empno's that are duplicate. So I don't see any flaws in the tip I have submitted.

For More Information

  • What do you think about this tip? E-mail the editor at [email protected] with your feedback.
  • The Best Oracle Web Links: tips, tutorials, scripts, and more.
  • Have an Oracle tip to offer your fellow DBAs and developers? The best tips submitted will receive a cool prize--submit your tip today!
  • Ask your technical Oracle questions--or help out your peers by answering them--in our live discussion forums.
  • Check out our Ask the Experts feature: Our SQL, database design, Oracle, SQL Server, DB2, metadata, and data warehousing gurus are waiting to answer your toughest questions.

Dig Deeper on Oracle and SQL