I'd like to get number for composing a graph of sales. Considering we have a lot of items in our store, I'd like to visualize just the nth biggest values and aggregate all the others. For example: If I have 100 items, I want to pull values greater than $1,000. All others would be aggregated so that I can get something like this:

  1. Product A: $10,000
  2. Product B: $12,000
  3. Product C: $8,000
  4. Other: $18,000 (the sum of smaller values)

How can I do that?

    Requires Free Membership to View

This actually sounds like two SQL SELECT queries to me. In the first place, you'd like to aggregate by product whose aggregate sales values exceed $1,000. On the other hand, you'd like to aggregate as Other all of those products whose aggregate values are $1,000 or less. So, I would write and test each query separately, then combine them with UNION ALL.

For example:

-- Top N Sales > 1000
SELECT ps.product_name, SUM(ps.sales) sales
FROM product_sales ps
GROUP BY ps.product_name
HAVING SUM(ps.sales) > 1000;

-- everything else
SELECT ps.product_name, SUM(ps.sales) sales
FROM product_sales ps
GROUP BY ps.product_name
HAVING SUM(ps.sales) <= 1000;

You probably have certain filter criteria, such as only for the current calendar year or some fiscal year or quarter, perhaps. Since you'd like everything else to be aggregated under Other, you can simply do this with that SQL SELECT query:

-- everything else, aggregated as Other
SELECT 'Other' product_name, SUM(tps.sales) sales
FROM (SELECT ps.product_name, SUM(ps.sales) sales      
FROM product_sales ps      
GROUP BY ps.product_name      
HAVING SUM(ps.sales) <= 1000      
) tps
GROUP BY 'Other';

More on SQL SELECT statements

Check out how to use a SQL SELECT statement on a nonexistent table

Find out how to debug poor response time when running a SQL SELECT statement

Discover how to translate information requests into SQL SELECT statements

Note that the GROUP BY has to match the scalar (nonaggregated) columns given in the SELECT clause. In this case, we're discarding the actual product names and aggregating them all under a literal string instead.

Now, you can combine the two data sets and sort them by descending sales using SQL UNION ALL:

-- Top N Sales > 1000
SELECT ps.product_name, SUM(ps.sales) sales
FROM product_sales ps
GROUP BY ps.product_name
HAVING SUM(ps.sales) > 1000
UNION ALL

-- everything else, aggregated as "Other"
SELECT 'Other' product_name, SUM(tps.sales) sales
FROM (SELECT ps.product_name, SUM(ps.sales) sales      
    FROM product_sales ps      
    GROUP BY ps.product_name      
    HAVING SUM(ps.sales) <= 1000      
    ) tps
GROUP BY 'Other'
ORDER BY sales DESC, product_name;

If you want Other to always wind up at the bottom of the sorted set (in the case where otherwise all of the other small aggregates would wind up being bigger than the larger individual aggregates), then the ORDER BY can use an expression like the following:

ORDER BY DECODE(product_name, 'Other', -sales, sales) DESC, product_name;

This would always make just the one Other row sort to the bottom of the list, assuming aggregate sales are never negative. I included product_name as the sort tiebreaker in the probably infrequent case where two products' aggregate sales are identical. It's probably no big deal, but it's good to write queries in such a way as to guarantee consistent behavior.

This was first published in November 2012

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: