|
There is a technique in SQL which uses a self-join to match
each row to the next row. With this type of query,
you can produce data such as rank or position, row differences,
running totals, and so on.
Within this self-join, it is necessary to specify a join
condition. You cannot just join each row to all other rows
(which would be a cross join). Specifically, you want each row to
join to the next row.
This is easy if you know which row is the next row. Clearly,
this can be done only based on the values in one
or more columns, since, as we all know, there is no such thing
as position of rows within a relational database table. Rows
are stored anywhere. Sequencing can be achieved only using the
ORDER BY clause in SELECTs. Sequencing is also implicit
in the inequality comparison operators.
For example, consider a self-join based on one row having a
greater value in the join column than the other:
select ...
from daTable as r1
left outer
join daTable as r2
on r2.foo > r1.foo
Here, each row (r1) is joined to all rows (r2)
which have a higher value in the foo column. This is a perfectly
valid self-join.
This is not yet the complete solution, though, because what we really
want to do is join each row to only one other row that has a higher
value, namely the lowest of these, i.e. only the next row. It should
be no surprise that this will be done with a subquery that uses MIN().
However, this is a good time to stop working on the solution
and look back at the actual question in this example. If the middle
column is the data being accumulated, and the
right column is the running total, this leaves the left column
as the only possible column that we can use for the self-join's ON clause.
And sadly, the left column isn't in sequence. The sample data is deficient.
What? Is that it? Is this the end of the story of the running total
self-join? No. Please stay tuned. There is more to come, in a future
answer.
|