Ask the Expert

Top N subcategories, Part 2

I want a query to summarize categories of information so that the top N subcategories within each major category are displayed. All category codes and associated elements are in the same table.

    Requires Free Membership to View

In Part 1 of this answer, we created the category table and loaded it with sample data. This part continues by discussing options for querying and presenting the contents of the hierarchical category table.

The first improvement on the query shown in Part 1 will be to sort the rows by their parentid:

select * from categories order by coalesce(parentid,id), descr
id parentid descr salesamt ------ -------- --------------- ----------- 1 NULL Category A NULL 5 1 subcat A1 103 6 1 subcat A2 207 7 1 subcat A3 94 8 1 subcat A4 255 9 1 subcat A5 159 2 NULL Category B NULL 16 2 subcat B1 315 15 2 subcat B2 315 14 2 subcat B3 377 13 2 subcat B4 301 12 2 subcat B5 315 3 NULL Category C NULL 4 NULL Category D NULL 10 4 subcat D1 150 11 4 subcat D2 195

Let's take a moment to look at the COALESCE in the ORDER BY. (This wonderful trick is the main reason why this answer is so long, and had to be broken up into parts!)

Notice that when parentid is null, i.e. for categories at the top level, the COALESCE function substitutes the id. So parent categories are sorted by their own ids. But when the parentid is not null, then it's a subcategory, so the subcategories are sorted by their parent categories. And since NULLs usually sort first, the overall result is that the subcategories are sorted under their parent categories. Neat, eh?

Furthermore, the second ORDER BY column ensures that subqueries are sorted by DESCR within their parent categories. If you examine the ids of the subcategories, you'll see that the subcategories under B were entered last, in reverse order.

But what if we didn't load our top level categories in the right sequence, and we wanted to sort by category and subcategory DESCR sequence? The only way to have both DESCR values available for sorting at the same time (i.e. on the same result set row) is to do a join.

select cat.id , cat.descr , subcat.id , subcat.descr , subcat.salesamt from categories cat left outer join categories subcat on cat.id = subcat.parentid where cat.parentid is null order by cat.descr , subcat.descr
id descr id descr salesamt ------ --------------- ------ --------------- ----------- 1 Category A 5 subcat A1 103 1 Category A 6 subcat A2 207 1 Category A 7 subcat A3 94 1 Category A 8 subcat A4 255 1 Category A 9 subcat A5 159 2 Category B 16 subcat B1 315 2 Category B 15 subcat B2 315 2 Category B 14 subcat B3 377 2 Category B 13 subcat B4 301 2 Category B 12 subcat B5 315 3 Category C NULL NULL NULL 4 Category D 10 subcat D1 150 4 Category D 11 subcat D2 195

This query uses a LEFT OUTER join so that top level categories like C which have no subcategories are included.

In Part 3 of this answer, we will move on to the actual problem, selecting the "top N" subcategories in each category.


This was first published in October 2003

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: