Requires Free Membership to View
Consider the following data:
id adate description 2 2005-02-11 1st-of-5 2 2005-02-12 2nd-of-5 2 2005-02-13 3rd-of-5 2 2005-02-14 4th-of-5 2 2005-02-15 5th-of-5 3 2005-03-01 1st-of-4 3 2005-03-02 2nd-of-4 3 2005-03-03 3rd-of-4 3 2005-03-04 4th-of-4
Now let's set up a theta join. A theta join uses an inequality in the join condition. The join will be a self-join, and each row will be joined to all rows for the same id which have an equal or later date.
select t1.id
, t1.adate
, t1.description
, t2.id
, t2.adate
, t2.description
from thetable as t1
inner
join thetable as t2
on t1.id = t2.id
and t1.adate <= t2.adate
order
by t1.id
, t1.adate
, t2.adate
The above query produces these results:
2 2005-02-11 1st-of-5 2 2005-02-11 1st-of-5 2 2005-02-11 1st-of-5 2 2005-02-12 2nd-of-5 2 2005-02-11 1st-of-5 2 2005-02-13 3rd-of-5 2 2005-02-11 1st-of-5 2 2005-02-14 4th-of-5 2 2005-02-11 1st-of-5 2 2005-02-15 5th-of-5 2 2005-02-12 2nd-of-5 2 2005-02-12 2nd-of-5 2 2005-02-12 2nd-of-5 2 2005-02-13 3rd-of-5 2 2005-02-12 2nd-of-5 2 2005-02-14 4th-of-5 2 2005-02-12 2nd-of-5 2 2005-02-15 5th-of-5 2 2005-02-13 3rd-of-5 2 2005-02-13 3rd-of-5 2 2005-02-13 3rd-of-5 2 2005-02-14 4th-of-5 2 2005-02-13 3rd-of-5 2 2005-02-15 5th-of-5 2 2005-02-14 4th-of-5 2 2005-02-14 4th-of-5 2 2005-02-14 4th-of-5 2 2005-02-15 5th-of-5 2 2005-02-15 5th-of-5 2 2005-02-15 5th-of-5 3 2005-03-01 1st-of-4 3 2005-03-01 1st-of-4 3 2005-03-01 1st-of-4 3 2005-03-02 2nd-of-4 3 2005-03-01 1st-of-4 3 2005-03-03 3rd-of-4 3 2005-03-01 1st-of-4 3 2005-03-04 4th-of-4 3 2005-03-02 2nd-of-4 3 2005-03-02 2nd-of-4 3 2005-03-02 2nd-of-4 3 2005-03-03 3rd-of-4 3 2005-03-02 2nd-of-4 3 2005-03-04 4th-of-4 3 2005-03-03 3rd-of-4 3 2005-03-03 3rd-of-4 3 2005-03-03 3rd-of-4 3 2005-03-04 4th-of-4 3 2005-03-04 4th-of-4 3 2005-03-04 4th-of-4
Notice how each row of t1 is joined only to those rows of t2 for the same id where the t2 date is equal or later.
Next, let's change the query so that it counts the number of joined t2 rows instead of displaying them:
select t1.id
, t1.adate
, t1.description
, count(*)
from thetable as t1
inner
join thetable as t2
on t1.id = t2.id
and t1.adate <= t2.adate
group
by t1.id
, t1.adate
, t1.description
This query produces these results:
2 2005-02-11 1st-of-5 5 2 2005-02-12 2nd-of-5 4 2 2005-02-13 3rd-of-5 3 2 2005-02-14 4th-of-5 2 2 2005-02-15 5th-of-5 1 3 2005-03-01 1st-of-4 4 3 2005-03-02 2nd-of-4 3 3 2005-03-03 3rd-of-4 2 3 2005-03-04 4th-of-4 1
Now all we have to do is restrict the returned rows based on the count being less than or equal to 2. In other words, we want t1 rows where there are only 1 or 2 t2 rows with a later date.
select t1.id
, t1.adate
, t1.description
from thetable as t1
inner
join thetable as t2
on t1.id = t2.id
and t1.adate <= t2.adate
group
by t1.id
, t1.adate
, t1.description
having count(*) <= 2
This query gives us our final results:
2 2005-02-14 4th-of-5 2 2005-02-15 5th-of-5 3 2005-03-03 3rd-of-4 3 2005-03-04 4th-of-4
This query will work in all databases, because it doesn't use proprietary syntax like TOP, LIMIT, or ROWNUM.
This was first published in March 2005

Join the conversationComment
Share
Comments
Results
Contribute to the conversation