Could you please explain to me the DB2 query given below?? It is a self join, but the part which I cannot understand is how it manages to generate sequential order numbers with the help of self join and where clause.
SELECT count(*) + 1000, A.SLS_TM_NBR, A.FFO_TYP_CD, A.FFO_NBR FROM SP1SLADB.VWK_ORD_GEN4 A, SP1SLADB.VWK_ORD_GEN4 B WHERE ((A.SLS_TM_NBR > B.SLS_TM_NBR) OR (A.SLS_TM_NBR = B.SLS_TM_NBR AND ?? A.FFO_TYP_CD > B.FFO_TYP_CD) OR (A.SLS_TM_NBR = B.SLS_TM_NBR AND ?? A.FFO_TYP_CD = B.FFO_TYP_CD AND ?? A.FFO_NBR >= B.FFO_NBR)) GROUP BY A.SLS_TM_NBR, A.FFO_TYP_CD, A.FFO_NBR
Don't let the complex join condition throw you. This self-join simply counts rows which are "less than" the row in question. This type of join is not an equi-join, where rows are matched that have equal values in some key. Rather, each row is joined to all the rows that satisfy the inequality. This is called a theta join even though few people use this terminology. (I just call it a "less than" join.) With each row joined to all its lesser rows, the GROUP BY generates a COUNT(*) which simply indicates how many lesser rows each row has.
One example where this type of join is useful is in determining a ranking. Imagine a table like this --
theID theValue a 6 b 8 c 12 d 7 e 11 f 15 g 13 h 21 i 9
Running a simple query with an ORDER BY will get you the records in the right sequence, but using a theta self-join will generate rankings. Remember that the ranking is actually just the number of lesser rows --
select A.theValue , count(*) as theRank from theTable as A , theTable as B where A.theValue >= B.theValue group by A.theValue
theValue theRank 6 1 7 2 8 3 9 4 11 5 12 6 13 7 15 8 21 9
Notice that the results are ranked from smallest to largest. This would be the case when ranking golf scores or demerit points, say. For other queries, where the top rank is the largest, the theta condition would be a "greater than" comparison instead of "less than." Also, notice I had to use "less than or equal to" to generate a result row for every row in the A table. Otherwise, the lowest row is left out, as there are no rows lower than the lowest.
Warning: the above query will work correctly only if there are no "ties." If you have ties in your data, the ranking query is substantially more complex.
So, given that the theta join conveniently produces a number, which may or may not be a ranking, but which in your example was further massaged by adding three zeroes to it, is this a good way to generate order numbers? The answer is maybe. It is not a good method if there are ties, because then there will be more than one row with the same number of lesser rows, and hence the same order number!! For your own comfort, double-check to make sure there are UNIQUE constraints on the table columns that are used in the WHERE clause.