Code updated on July 31, 2002.
Here is some SQL code to compare nodes and structures of trees in the Nested Sets model for hierarchies. Rudy Limeback replied to a question
Requires Free Membership to View
There are really several kinds of equality comparisons when you are dealing with a hierarchy:
- Same nodes in both tables
- Same structure in both tables
- Same nodes and structure in both tables -- they are identical
Let me once more invoke my semi-quasi-famous organization chart in the nested sets model. If you do not understand the nested sets model, then read the earlier articles here and here:
CREATE TABLE Personnel (emp CHAR(10) NOT NULL PRIMARY KEY, lft INTEGER NOT NULL UNIQUE CHECK (lft > 0), rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1), CONSTRAINT order_okay CHECK (lft < rgt) );
Insert sample data:
Personnel emp lft rgt ====================== 'Albert' 1 12 'Bert' 2 3 'Chuck' 4 11 'Donna' 5 6 'Eddie' 7 8 'Fred' 9 10
The organizational chart would look like this as a directed graph:
Albert (1,12)
/ \
/ \
Bert (2,3) Chuck (4,11)
/ | \
/ | \
/ | \
/ | \
Donna (5,6) Eddie (7,8) Fred (9,10)
Case one: Same nodes
Let's create a second table with the same nodes, but with a different structure:
CREATE TABLE Personnel_2 (emp CHAR(10) NOT NULL PRIMARY KEY, lft INTEGER NOT NULL UNIQUE CHECK (lft > 0), rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1), CONSTRAINT order_okay CHECK (lft < rgt) );
Insert sample data:
Personnel_2
emp lft rgt
======================
'Albert' 1 12
'Bert' 2 3
'Chuck' 4 5
'Donna' 6 7
'Eddie' 8 9
'Fred' 10 11
Albert(1,12)
|
------------------------------------------
| | | | |
Bert(2,3) Chuck(4,5) Donna(6,7) Eddie(8,9) Fred(10,11)
Rudy gave a few different ways of matching nodes between Personnel and Personnel_2. Here is one way that is a little different:
SELECT 'They have the same nodes'
FROM Personnel_2 AS P0
FULL OUTER JOIN
Personnel_3 AS P1
ON P0.emp = P1.emp
HAVING (SELECT COUNT(*) FROM Personnel_2)
= (SELECT COUNT(*) FROM Personnel_3)
AND (SELECT COUNT(*) FROM Personnel_2)
= COUNT(*)
AND (SELECT COUNT(*) FROM Personnel_3)
= COUNT(*);
In fairness, a FULL OUTER JOIN is hard to find in many SQL products, even today. The idea is that anyone without a match in the other table will get generated NULLs and become a row by himself for the count.
Case two: Same structure
I will not bother with an ASCII drawing, but let's present a table with sample data that has different people inside the same structure.
Personnel_3 emp lft rgt ====================== 'Amber' 1 12 'Bobby' 2 3 'Charles' 4 11 'Donald' 5 6 'Edward' 7 8 'Frank' 9 10
If these things are really separate trees, then life is easy. You do the same join as before but with the (lft,rgt) pairs:
SELECT DISTINCT 'They have the same structure'
FROM Personnel AS P0
WHERE COUNT(P0.emp)
= (SELECT COUNT(*)
FROM Personnel AS P0
FULL OUTER JOIN
Personnel_3 AS P1
ON P0.lft = P1.lft
AND P0.rgt = P1.rgt);
Case three: Same nodes and same structure
This is not a big jump to simply extend the predicate:
SELECT DISTINCT 'They are identical'
FROM Personnel AS P0
WHERE COUNT(P0.emp)
= (SELECT COUNT(*)
FROM Personnel AS P0
FULL OUTER JOIN
Personnel_2 AS P1
ON P0.lft = P1.lft
AND P0.rgt = P1.rgt
AND P0.emp = P1.emp);
But more often than not, you will be comparing subtrees within the same tree. This is best handled by putting the two sub-trees into a canonical form. First you need the root node (i.e the emp in this sample data) and then you can re-number the (lft, rgt) pairs with a derived table of this form:
(SELECT emp,
lft - (SELECT MIN(lft FROM Personnel),
rgt - (SELECT MIN(lft FROM Personnel)
FROM Personnel AS P0
WHERE P0.lft
BETWEEN (SELECT lft
FROM Personnel
WHERE emp = :emp_root_subtree_2)
AND (SELECT rgt
FROM Personnel
WHERE emp = :emp_root_subtree_2))
AS CP0 (emp, lft, rgt)
The derived table with have its (lft,rgt) pairs start at zero instead of one. That is actually an advantage when you insert a canonical tree structure into an existing table, but that is another article.
If you would like to see this all in one monster query:
SELECT CASE COUNT(P.emp)
WHEN (SELECT COUNT(*)
FROM Personnel AS P0
FULL OUTER JOIN
Personnel_3 AS P1
ON P0.lft = P1.lft
AND P0.rgt = P1.rgt
AND P0.emp = P1.emp)
THEN 'The trees are identical'
WHEN (SELECT COUNT(*)
FROM Personnel AS P0
FULL OUTER JOIN
Personnel_2 AS P1
ON P0.emp = P1.emp)
THEN 'The trees have the same nodes'
WHEN (SELECT COUNT(*)
FROM Personnel AS P0
FULL OUTER JOIN
Personnel_2 AS P1
ON P0.lft = P1.lft
AND P0.rgt = P1.rgt)
THEN 'The trees have the same structure'
ELSE 'The trees have different node sets and structure'
FROM Personnel AS P;
This depends on the CASE expression operating from left to right and returning the first TRUE clause it finds. Since there will a lot of indexes on the tree tables, joins and aggregate functions should be relatively fast to perform. For an exercise, try writing the same structure queries with the adjacency list model.
About the Author
Joe Celko is author of SQL for Smarties: Advanced SQL Programming (Morgan-Kaufmann, 1999).
For More Information
- What do you think about this tip? E-mail us at tdichiara@techtarget.com with your feedback.
- The Best SQL Web Links: tips, tutorials, scripts, and more.
- Have an SQL tip to offer your fellow DBA's and developers? The best tips submitted will receive a cool prize--submit your tip today!
- Ask your technical SQL questions--or help out your peers by answering them--in our live discussion forums.
- Ask the Experts yourself: Our SQL gurus are waiting to answer your toughest questions.
This was first published in March 2002

Join the conversationComment
Share
Comments
Results
Contribute to the conversation