How do you represent a recursive relationship using SQL?
This will be a brief overview only. You can find good Web tutorials about recursive relationships, using search terms like: recursive relationship, reflexive relationship, adjacency model, nested hierarchy, or tree structure.
The basic idea is that each entity has subentities. For example, a category can have subcategories, a subcategory can have sub-subcategories, and so on. Employees can have other employees reporting to them. Directories or folders, like on your C drive, can contain other folders (except this analogy is messy because folders also contain files).
The important characteristics of a tree or hierarchy structure are:
1. The relationship is one-to-many, not many-to-many
This is important. If it's a many-to-many relationship, the tree structure won't work. Each subcategory belongs to only one category, each folder exists within only one containing folder, and each employee can report to only one other employee. A many-to-many relationship requires a more complex structure, which we will not examine here.
2. There is only one table
This is not immediately obvious. I have seen people declare a Category table, and then a separate Subcategory table, then a Subsubcategory table, and so on. Eventually, they stop, but not without wondering what they'll do if their data ever has one more level than they have declared tables. In the tree structure, there's only one table, and it can hold as many levels as desired.
3. Each sublevel "points to" its parent level
The rows in a tree or hierarchy table are related using foreign keys, like any other one-to-many relationship. Just remember that the child always points to its parent. If you did it the other way around, you would need to maintain a list of child keys in each parent, and that violates first normal form.
4. Each entity at the top of its hierarchy has a parent foreign key which is NULL
In the employees example, there is usually only one tree, with the CEO at the top, or root. In the categories example, there are usually several roots, the top level categories. Because these rows have no parent, their foreign keys are set to NULL.
The following SQL declares a table and a reflexive relationship for the categories of tasks involved in a systems project.
create table ProjectCategories ( catid integer not null , catdesc varchar(25) not null , catparent integer null , primary key (catid) , foreign key (catparent) references ProjectCategories (catid) )
Here are sample rows, meant only as an illustration, and not an exhaustive list of project categories.
catid catdesc catparent 1 Analysis -- 2 Requirements 1 3 Scope 1 4 Objectives 3 5 Constraints 3 6 Design -- 7 Conceptual 6 8 Data Model 7 9 Business Rules 7 10 Logical 6 11 Development -- 12 Database 11 13 Application 11 14 Implementation -- 15 Conversion 14 16 Maintenance --
These rows exhibit their hierarchy or tree structure when displayed with indentation:
Analysis Requirements Scope Objectives Constraints Design Conceptual Data Model Business Rules Logical Development Database Application Implementation Conversion Maintenance
Note that sequencing of categories, and subcategories within categories, is best achieved with an additional column.