|
Consider the following data:
id adate description
2 2005-02-11 1st-of-5
2 2005-02-12 2nd-of-5
2 2005-02-13 3rd-of-5
2 2005-02-14 4th-of-5
2 2005-02-15 5th-of-5
3 2005-03-01 1st-of-4
3 2005-03-02 2nd-of-4
3 2005-03-03 3rd-of-4
3 2005-03-04 4th-of-4
Now let's set up a theta join. A theta join uses an
inequality in the join condition. The join will be a self-join,
and each row will be joined to all rows for the same id which
have an equal or later date.
select t1.id
, t1.adate
, t1.description
, t2.id
, t2.adate
, t2.description
from thetable as t1
inner
join thetable as t2
on t1.id = t2.id
and t1.adate <= t2.adate
order
by t1.id
, t1.adate
, t2.adate
The above query produces these results:
2 2005-02-11 1st-of-5 2 2005-02-11 1st-of-5
2 2005-02-11 1st-of-5 2 2005-02-12 2nd-of-5
2 2005-02-11 1st-of-5 2 2005-02-13 3rd-of-5
2 2005-02-11 1st-of-5 2 2005-02-14 4th-of-5
2 2005-02-11 1st-of-5 2 2005-02-15 5th-of-5
2 2005-02-12 2nd-of-5 2 2005-02-12 2nd-of-5
2 2005-02-12 2nd-of-5 2 2005-02-13 3rd-of-5
2 2005-02-12 2nd-of-5 2 2005-02-14 4th-of-5
2 2005-02-12 2nd-of-5 2 2005-02-15 5th-of-5
2 2005-02-13 3rd-of-5 2 2005-02-13 3rd-of-5
2 2005-02-13 3rd-of-5 2 2005-02-14 4th-of-5
2 2005-02-13 3rd-of-5 2 2005-02-15 5th-of-5
2 2005-02-14 4th-of-5 2 2005-02-14 4th-of-5
2 2005-02-14 4th-of-5 2 2005-02-15 5th-of-5
2 2005-02-15 5th-of-5 2 2005-02-15 5th-of-5
3 2005-03-01 1st-of-4 3 2005-03-01 1st-of-4
3 2005-03-01 1st-of-4 3 2005-03-02 2nd-of-4
3 2005-03-01 1st-of-4 3 2005-03-03 3rd-of-4
3 2005-03-01 1st-of-4 3 2005-03-04 4th-of-4
3 2005-03-02 2nd-of-4 3 2005-03-02 2nd-of-4
3 2005-03-02 2nd-of-4 3 2005-03-03 3rd-of-4
3 2005-03-02 2nd-of-4 3 2005-03-04 4th-of-4
3 2005-03-03 3rd-of-4 3 2005-03-03 3rd-of-4
3 2005-03-03 3rd-of-4 3 2005-03-04 4th-of-4
3 2005-03-04 4th-of-4 3 2005-03-04 4th-of-4
Notice how each row of t1 is joined only to those rows
of t2 for the same id where the t2 date is equal or later.
Next, let's change the query so that it counts the number
of joined t2 rows instead of displaying them:
select t1.id
, t1.adate
, t1.description
, count(*)
from thetable as t1
inner
join thetable as t2
on t1.id = t2.id
and t1.adate <= t2.adate
group
by t1.id
, t1.adate
, t1.description
This query produces these results:
2 2005-02-11 1st-of-5 5
2 2005-02-12 2nd-of-5 4
2 2005-02-13 3rd-of-5 3
2 2005-02-14 4th-of-5 2
2 2005-02-15 5th-of-5 1
3 2005-03-01 1st-of-4 4
3 2005-03-02 2nd-of-4 3
3 2005-03-03 3rd-of-4 2
3 2005-03-04 4th-of-4 1
Now all we have to do is restrict the returned rows based on the count
being less than or equal to 2. In other words, we want t1 rows
where there are only 1 or 2 t2 rows with a later date.
select t1.id
, t1.adate
, t1.description
from thetable as t1
inner
join thetable as t2
on t1.id = t2.id
and t1.adate <= t2.adate
group
by t1.id
, t1.adate
, t1.description
having count(*) <= 2
This query gives us our final results:
2 2005-02-14 4th-of-5
2 2005-02-15 5th-of-5
3 2005-03-03 3rd-of-4
3 2005-03-04 4th-of-4
This query will work in all databases, because it doesn't use
proprietary syntax like TOP, LIMIT, or ROWNUM.
|