To continue reading for free, register below or login
To read more you must become a member of SearchOracle.com
');
// -->

These are called "runs" or "sequences" and the SQL can sure be tricky.
In this table, the MarketID column provides the inherent ordering of data.
Image the IDs as points on a line:
------o------o------o------o------
Now we will self-join this table:
from Market_XY as t1
inner
join Market_XY as t3
on t3.MarketID > t1.MarketID
and t3.XY = t1.XY
We will pair up each row (t1) to all other rows (t3) that
have a higher ID as well as the same XY. The same XY means these
two rows are a potential run:
------o------o------o------o------
| | | |
t1 t3---->t3---->t3---->
Next, we self-join once more, but with a LEFT OUTER JOIN, using an IS NULL
condition in the WHERE clause to make sure no matches are found:
from Market_XY as t1
inner
join Market_XY as t3
on t3.MarketID > t1.MarketID
and t3.XY = t1.XY
left outer
join Market_XY as t2
on t2.MarketID between t1.MarketID and t3.MarketID
and t2.XY <> t3.XY
where t2.MarketID is null
Essentially, we want all t1-t3 runs, where there is no t2 between
them with a different XY. The LEFT OUTER JOIN looks for them, but
the IS NULL keeps only the t1-t3 pairs where no t2 exists.
But obviously there will be overlaps of runs. In the given sample data,
notice that the run (1,2,3) of Xs, has runs (1,2) and (2,3) inside it.
What remains now is simply to take the longest runs. This involves
careful grouping and the use of both MIN() and MAX().
select XY
, hiID - min(loID) + 1 as Count
from (
select t1.MarketID as loID
, t1.XY
, max(t3.MarketID) as hiID
from Market_XY as t1
inner
join Market_XY as t3
on t3.MarketID > t1.MarketID
and t3.XY = t1.XY
left outer
join Market_XY as t2
on t2.MarketID
between t1.MarketID
and t3.MarketID
and t2.XY <> t3.XY
where t2.MarketID is null
group
by t1.MarketID
, t1.XY
) as hiIDs
group
by XY
, hiID
order
by hiID
It looks tricky but it really isn't. To understand it, run the subquery
alone, without the MAX() or GROUP BY, but showing the t3 columns. Then see
what happens with the MAX() and GROUP BY. Then apply the outer query with
its GROUP BY and MIN(). Note that MAX - MIN + 1 gives the count for the run.
P.S. If anyone has an analytic SQL solution for this,
please do send it in. I would love to see it.
|