Problem solve Get help with specific problems with your technologies, process and projects.

Top N subcategories, Part 2

I want a query to summarize categories of information so that the top N subcategories within each major category are displayed. All category codes and associated elements are in the same table.

In Part 1 of this answer, we created the category table and loaded it with sample data. This part continues by discussing options for querying and presenting the contents of the hierarchical category table.

The first improvement on the query shown in Part 1 will be to sort the rows by their parentid:

select * from categories order by coalesce(parentid,id), descr
id parentid descr salesamt ------ -------- --------------- ----------- 1 NULL Category A NULL 5 1 subcat A1 103 6 1 subcat A2 207 7 1 subcat A3 94 8 1 subcat A4 255 9 1 subcat A5 159 2 NULL Category B NULL 16 2 subcat B1 315 15 2 subcat B2 315 14 2 subcat B3 377 13 2 subcat B4 301 12 2 subcat B5 315 3 NULL Category C NULL 4 NULL Category D NULL 10 4 subcat D1 150 11 4 subcat D2 195

Let's take a moment to look at the COALESCE in the ORDER BY. (This wonderful trick is the main reason why this answer is so long, and had to be broken up into parts!)

Notice that when parentid is null, i.e. for categories at the top level, the COALESCE function substitutes the id. So parent categories are sorted by their own ids. But when the parentid is not null, then it's a subcategory, so the subcategories are sorted by their parent categories. And since NULLs usually sort first, the overall result is that the subcategories are sorted under their parent categories. Neat, eh?

Furthermore, the second ORDER BY column ensures that subqueries are sorted by DESCR within their parent categories. If you examine the ids of the subcategories, you'll see that the subcategories under B were entered last, in reverse order.

But what if we didn't load our top level categories in the right sequence, and we wanted to sort by category and subcategory DESCR sequence? The only way to have both DESCR values available for sorting at the same time (i.e. on the same result set row) is to do a join.

select cat.id , cat.descr , subcat.id , subcat.descr , subcat.salesamt from categories cat left outer join categories subcat on cat.id = subcat.parentid where cat.parentid is null order by cat.descr , subcat.descr
id descr id descr salesamt ------ --------------- ------ --------------- ----------- 1 Category A 5 subcat A1 103 1 Category A 6 subcat A2 207 1 Category A 7 subcat A3 94 1 Category A 8 subcat A4 255 1 Category A 9 subcat A5 159 2 Category B 16 subcat B1 315 2 Category B 15 subcat B2 315 2 Category B 14 subcat B3 377 2 Category B 13 subcat B4 301 2 Category B 12 subcat B5 315 3 Category C NULL NULL NULL 4 Category D 10 subcat D1 150 4 Category D 11 subcat D2 195

This query uses a LEFT OUTER join so that top level categories like C which have no subcategories are included.

In Part 3 of this answer, we will move on to the actual problem, selecting the "top N" subcategories in each category.

Dig Deeper on Oracle and SQL

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.