Q

Top N rows for each X

I want a query that retrieves the two most recently inserted records for each id. My table has the columns id, adate, and description. There are five records with id 2 and four records with id 3. I want to retrieve the top two records for both id 2 and id 3, so I should get four records in total: two with id 2 and two with id 3.

Consider the following data:

id adate      description
 2 2005-02-11 1st-of-5
 2 2005-02-12 2nd-of-5
 2 2005-02-13 3rd-of-5
 2 2005-02-14 4th-of-5
 2 2005-02-15 5th-of-5
 3 2005-03-01 1st-of-4
 3 2005-03-02 2nd-of-4
 3 2005-03-03 3rd-of-4
 3 2005-03-04 4th-of-4
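
To follow along, the sample data can be loaded into a scratch database. The sketch below is one way to do that, using Python's built-in sqlite3 module and an in-memory database. The column types and the helper name make_thetable are assumptions, since the question does not state them.

```python
import sqlite3

# Sample rows copied from the table above: (id, adate, description).
SAMPLE_ROWS = [
    (2, "2005-02-11", "1st-of-5"),
    (2, "2005-02-12", "2nd-of-5"),
    (2, "2005-02-13", "3rd-of-5"),
    (2, "2005-02-14", "4th-of-5"),
    (2, "2005-02-15", "5th-of-5"),
    (3, "2005-03-01", "1st-of-4"),
    (3, "2005-03-02", "2nd-of-4"),
    (3, "2005-03-03", "3rd-of-4"),
    (3, "2005-03-04", "4th-of-4"),
]


def make_thetable(rows=SAMPLE_ROWS):
    """Create an in-memory SQLite database holding `thetable` (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Assumed column types. Dates are stored as ISO-8601 text, which
    # sorts and compares in chronological order as plain strings.
    conn.execute(
        "create table thetable (id integer, adate text, description text)"
    )
    conn.executemany("insert into thetable values (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = make_thetable()
print(conn.execute("select count(*) from thetable").fetchone()[0])  # 9
```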

Now let's set up a theta join. A theta join is a join whose condition uses a comparison operator such as <=, rather than only equality. The join will be a self-join, and each row will be joined to all rows for the same id which have an equal or later date.

select t1.id
     , t1.adate
     , t1.description
     , t2.id
     , t2.adate
     , t2.description
  from thetable as t1
inner
  join thetable as t2
    on t1.id = t2.id
   and t1.adate <= t2.adate
order 
    by t1.id
     , t1.adate
     , t2.adate

The above query produces these results:

 2 2005-02-11 1st-of-5   2 2005-02-11 1st-of-5
 2 2005-02-11 1st-of-5   2 2005-02-12 2nd-of-5
 2 2005-02-11 1st-of-5   2 2005-02-13 3rd-of-5
 2 2005-02-11 1st-of-5   2 2005-02-14 4th-of-5
 2 2005-02-11 1st-of-5   2 2005-02-15 5th-of-5

 2 2005-02-12 2nd-of-5   2 2005-02-12 2nd-of-5
 2 2005-02-12 2nd-of-5   2 2005-02-13 3rd-of-5
 2 2005-02-12 2nd-of-5   2 2005-02-14 4th-of-5
 2 2005-02-12 2nd-of-5   2 2005-02-15 5th-of-5

 2 2005-02-13 3rd-of-5   2 2005-02-13 3rd-of-5
 2 2005-02-13 3rd-of-5   2 2005-02-14 4th-of-5
 2 2005-02-13 3rd-of-5   2 2005-02-15 5th-of-5

 2 2005-02-14 4th-of-5   2 2005-02-14 4th-of-5
 2 2005-02-14 4th-of-5   2 2005-02-15 5th-of-5

 2 2005-02-15 5th-of-5   2 2005-02-15 5th-of-5

 3 2005-03-01 1st-of-4   3 2005-03-01 1st-of-4
 3 2005-03-01 1st-of-4   3 2005-03-02 2nd-of-4
 3 2005-03-01 1st-of-4   3 2005-03-03 3rd-of-4
 3 2005-03-01 1st-of-4   3 2005-03-04 4th-of-4

 3 2005-03-02 2nd-of-4   3 2005-03-02 2nd-of-4
 3 2005-03-02 2nd-of-4   3 2005-03-03 3rd-of-4
 3 2005-03-02 2nd-of-4   3 2005-03-04 4th-of-4

 3 2005-03-03 3rd-of-4   3 2005-03-03 3rd-of-4
 3 2005-03-03 3rd-of-4   3 2005-03-04 4th-of-4

 3 2005-03-04 4th-of-4   3 2005-03-04 4th-of-4

Notice how each row of t1 is joined only to those rows of t2 for the same id where the t2 date is equal or later.

Next, let's change the query so that it counts the number of joined t2 rows instead of displaying them:

select t1.id
     , t1.adate
     , t1.description
     , count(*)
  from thetable as t1
inner
  join thetable as t2
    on t1.id = t2.id
   and t1.adate <= t2.adate
group
    by t1.id
     , t1.adate
     , t1.description

This query produces these results:

 2 2005-02-11 1st-of-5  5
 2 2005-02-12 2nd-of-5  4
 2 2005-02-13 3rd-of-5  3
 2 2005-02-14 4th-of-5  2
 2 2005-02-15 5th-of-5  1
 3 2005-03-01 1st-of-4  4
 3 2005-03-02 2nd-of-4  3
 3 2005-03-03 3rd-of-4  2
 3 2005-03-04 4th-of-4  1

Now all we have to do is restrict the returned rows based on the count being less than or equal to 2. In other words, we want t1 rows that are matched by at most two t2 rows: the row itself and, at most, one row with a later date.

select t1.id
     , t1.adate
     , t1.description
  from thetable as t1
inner
  join thetable as t2
    on t1.id = t2.id
   and t1.adate <= t2.adate
group
    by t1.id
     , t1.adate
     , t1.description
having count(*) <= 2 

This query gives us our final results:

 2 2005-02-14 4th-of-5
 2 2005-02-15 5th-of-5
 3 2005-03-03 3rd-of-4
 3 2005-03-04 4th-of-4
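
As a quick check, the final query can be run against the sample data. This is a minimal sketch assuming SQLite through Python's built-in sqlite3 module. The helper name top_n_per_id, the bound parameter that stands in for the literal 2, and the ORDER BY clause (added only to make the output order predictable) are illustrative additions, not part of the query above.

```python
import sqlite3

# The final query, with the literal 2 replaced by a bound parameter
# and an ORDER BY appended so the row order is deterministic.
TOP_N_SQL = """
select t1.id
     , t1.adate
     , t1.description
  from thetable as t1
inner
  join thetable as t2
    on t1.id = t2.id
   and t1.adate <= t2.adate
group
    by t1.id
     , t1.adate
     , t1.description
having count(*) <= ?
order
    by t1.id
     , t1.adate
"""


def top_n_per_id(conn, n):
    """Return the n latest rows for each id (hypothetical helper)."""
    return conn.execute(TOP_N_SQL, (n,)).fetchall()


conn = sqlite3.connect(":memory:")
# Assumed column types; ISO-8601 text dates compare chronologically.
conn.execute("create table thetable (id integer, adate text, description text)")
conn.executemany(
    "insert into thetable values (?, ?, ?)",
    [
        (2, "2005-02-11", "1st-of-5"),
        (2, "2005-02-12", "2nd-of-5"),
        (2, "2005-02-13", "3rd-of-5"),
        (2, "2005-02-14", "4th-of-5"),
        (2, "2005-02-15", "5th-of-5"),
        (3, "2005-03-01", "1st-of-4"),
        (3, "2005-03-02", "2nd-of-4"),
        (3, "2005-03-03", "3rd-of-4"),
        (3, "2005-03-04", "4th-of-4"),
    ],
)

for row in top_n_per_id(conn, 2):
    print(row)
```

Passing 1 instead of 2 returns only the latest row for each id, which shows that the HAVING threshold is the "N" in "top N".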

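This query uses only standard joins, GROUP BY, and HAVING, so it should run in most databases; it doesn't rely on proprietary syntax like TOP, LIMIT, or ROWNUM. Note that it assumes adate values are unique within each id. Rows that tie on adate are all counted against each other, so a tie at the cutoff can return more or fewer than two rows for that id.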


This was first published in March 2005
