I have a table of dates and number of events per date. I need to find a 5-day range where the most number of events happened. I just couldn't figure out a way to sum up the values of events-per-date for every 5 rows. This means all possible 5 consecutive rows, but dates are not necessarily consecutive (days where no events occurred are not kept).
eventdate numevents 2006-06-16 3 2006-06-17 3 2006-06-18 3 2006-06-19 3 2006-06-20 4 2006-06-21 4 2006-06-22 4 2006-06-23 4 2006-06-24 3 2006-06-25 2 2006-06-26 2 2006-06-27 2 2006-06-30 2 2006-07-01 2 2006-07-04 1
This is only a very small section of the table.
Requires Free Membership to View
What an interesting problem. Thanks for sending it in.
The solution requires a self-join. What we need to do is join the table to itself, with a special condition: each row must be joined only to itself and the immediately preceeding four rows. In other words, we join each row to the top 5 rows that have an eventdate that is equal to or less than that row. Please make sure you understand this statement of the problem before looking at the following SQL:
select t1.eventdate
, t1.numevents
, t2.eventdate
, t2.numevents
from events as t1
inner
join events as t2
on t2.eventdate <= t1.eventdate
and 5 >=
( select count(*)
from events
where eventdate
between t2.eventdate
and t1.eventdate )
order
by t1.eventdate
, t2.eventdate
The above query produces the following results:
t1 t2 2006-06-16 3 2006-06-16 3 2006-06-17 3 2006-06-16 3 2006-06-17 3 2006-06-17 3 2006-06-18 3 2006-06-16 3 2006-06-18 3 2006-06-17 3 2006-06-18 3 2006-06-18 3 2006-06-19 3 2006-06-16 3 2006-06-19 3 2006-06-17 3 2006-06-19 3 2006-06-18 3 2006-06-19 3 2006-06-19 3 2006-06-20 4 2006-06-16 3 2006-06-20 4 2006-06-17 3 2006-06-20 4 2006-06-18 3 2006-06-20 4 2006-06-19 3 2006-06-20 4 2006-06-20 4 2006-06-21 4 2006-06-17 3 2006-06-21 4 2006-06-18 3 2006-06-21 4 2006-06-19 3 2006-06-21 4 2006-06-20 4 2006-06-21 4 2006-06-21 4 and so on
Notice that the first four t1 dates do not have five preceeding t2 dates. We'll come back to that in a moment.
Now that we know we're looking at the right data, we can do the next step, which is to add up the numevents values for the joined t2 rows. We do this by changing the query into a GROUP BY query:
select t1.eventdate
, count(*) as rows
, sum(t2.numevents) as total
from events as t1
inner
join events as t2
on t2.eventdate <= t1.eventdate
and 5 >=
( select count(*)
from events
where eventdate
between t2.eventdate
and t1.eventdate )
group
by t1.eventdate
The above query produces the following results:
eventdate rows total 2006-06-16 1 3 2006-06-17 2 6 2006-06-18 3 9 2006-06-19 4 12 2006-06-20 5 16 2006-06-21 5 17 2006-06-22 5 18 2006-06-23 5 19 2006-06-24 5 19 2006-06-25 5 17 2006-06-26 5 15 2006-06-27 5 13 2006-06-30 5 11 2006-07-01 5 10 2006-07-04 5 9
Notice that we can add HAVING COUNT(*)=5 to this query if we wish to eliminate the first four dates.
The final step is to take the maximum of the total column, and there are a number of ways to do it, which we don't really have room to discuss. The easiest is to add ORDER BY total DESC to the above query, and, by inspection, take the first result row. Caution: look for ties for the maximum, as there are in this case for the 23rd and 24th.
This was first published in October 2006

Join the conversationComment
Share
Comments
Results
Contribute to the conversation