Q
Problem solve Get help with specific problems with your technologies, process and projects.

Delete all but two rows in each group

I have a table with several columns in the primary key, one of which is timestamp. For each Col4 value, I want to prune the table to make sure that only the two most recent times are in the table for each unique Col2, Col3. I am getting wrapped around group/distinct statements here. I think I need an inner join to select values.

I have a table with several columns in the primary key, one of which is timestamp. For example,

Table1
Col1*  Col2*   Col3*  Col4*  Col5 ....
time0   X       Y      A
time1   X       Y      A
time2   X       Y      A
time01  X       Y      B

The asterisk means that the column is part of the primary key definition for the table. For each Col4 value, I want to prune the table to make sure that only the two most recent times are in the table for each unique Col2, Col3. I am getting wrapped around group/distinct statements here. I think I need an inner join to select values, kind of like I would be getting rid of duplicate values in the table, but I just can't figure it out. Thanks for any help.

Before starting any procedure that involves deleting, it's a good idea to take a backup first.

Here's a query that will select only the rows you want to keep --

select Col1, Col2, Col3, Col4, Col5
  from Table1 as T
 where 2
     > ( select count(*)
           from Table1
          where Col4 = T.Col4
            and Col3 = T.Col3
            and Col2 = T.Col2
            and Col1 > T.Col1 )

Now the challenge is to delete the rows that don't satisfy the above condition. Be warned that it can be tricky to successfully "invert" this logical condition. Do you use NOT? Do you reverse the greater than? If so, which one? Both of them?

Some databases will allow you to delete from the same table that you're selecting from, but even in those databases, it's usually safer to save what you want to keep in a holding table first. Then you would delete from the original table where a matching row doesn't exist in the holding table, or alternatively, drop the original table and rename the holding table as the original table. Either way, you achieve peace of mind by being able to see which rows you are keeping before attempting the delete.

This was last published in May 2006

Dig Deeper on Oracle and SQL

PRO+

Content

Find more PRO+ content and other member only offers, here.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Please create a username to comment.

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close