To continue reading for free, register below or login
To read more you must become a member of SearchOracle.com
');
// -->

This is a great example of a requirement to return a
subtree. (Please
refer to this Wikipedia page if you do not recognize terms like
subtree, node, or level.)
Suppose the given eid is 2. So we are looking for all nodes that
are in the subtree of 2. Here is one solution which uses nested subqueries:
select eid
, ename
, mgrid
from employees as t1
where mgrid = 2
or exists
( select *
from employees as t2
where eid = t1.mgrid
and (
mgrid = 2
or exists
( select *
from employees as t3
where eid = t2.mgrid
and mgrid = 2
)
)
)
In effect, the query returns all employees:
- whose manager is 2, or
- whose manager's manager is 2, or
- whose manager's manager's manager is 2
There are many approaches to subtree problems, and this one certainly
appears straightforward enough. However, our choice
of solution should always be governed by practical
considerations.
Is this a practical solution? Absolutely, yes. We have only four
levels of data. Therefore, the query needs to "walk down"
at most three levels from any given node. Try the query
with the given node 1 to confirm.
But what if there are actually five levels of data and not four?
Will the query still work?
Of course the query will still work! <grin> It
will correctly return complete subtrees when there are one, two,
or three levels below the given node. However, if a given node has more
than three levels below it, only the first three levels down from that
node will be returned.
What to do, what to do? Obviously, one answer is simply to
extend the nested subquery pattern in the query to an additional level. Practical?
You betcha!
But what if there actually is an unknown number of levels of data
and not five? What then?
For the employee hierarchy, in the real world there is no
such thing as an unknown number of levels. Ask the HR department.
Then extend the query as many levels as necessary, plus one or two just
to be safe. The number of levels is not really the issue here.
Instead, it is the more practical consideration of
whether umpteen levels of data producing an exponential bazillion rows
of output is really of any use to anyone. Perhaps limiting the
results to four levels down isn't such a bad idea. See also
Categories and Subcategories
for additional thoughts and query examples.
The employee hierarchy is an example of the
adjacency list model, recognized by its
use of a "parent id" (in this case mgrid).
For other data structures, where there really is an unknown
number of levels, consider using a different data model.
See Trees in SQL by Joe Celko, which describes the
nested set model,
an alternative to the adjacency list model.
|