Problem solve Get help with specific problems with your technologies, process and projects.

Top 3 counts, revisited

In "Top 3 counts querying two tables" (27 June 2005), there is a potential trap.

In Top 3 counts querying two tables (27 June 2005), there is a potential trap. Actually it can exist in any "top" type report. Let's say that you are going to pay commission bonuses to the top three sales performers this month out of 100 reps. If a query is written just to get the top three, you might find that Rep11 = $100,000, Rep13 = $98,500, Rep28 = $97,120 and Rep41 = $97,120. All of the other reps are below $97,000. However, just retrieving the top three will probably miss the fourth (equally deserving) rep. I have run into this in more than one "top rank" type report. It might be wise to check that no TIES exist. I believe that Oracle has a ties function that allows for just this circumstance. In MySQL (version 3.23.xx) I usually run a minimal test query with the limit, i.e. LIMIT 3, capture the lowest value, and then run the actual query using HAVING as the restriction. You have been very helpful to me in the past and I really appreciate your insights.

You are right to be concerned about ties. However, the problem you mention happens only if you're using (one version of) TOP, or LIMIT. The queries given in my previous answer actually do take ties into consideration.

As far as using TOP or LIMIT is concerned, I'm not sure about Oracle, but in Microsoft Access, ties are automatically included in TOP queries, and in Microsoft SQL Server, you can say TOP n WITH TIES. In MySQL, LIMIT works strictly on rows, not on values. But all of this is proprietary syntax, and standard SQL would be much preferable.

More specifically, let's have a look at the generic top 10 query which was given in FIRST N rows, TOP N rows, LAST N rows, BOTTOM N rows...:

select foo
     , bar
  from yourtable X 
 where ( select count(*) 
           from yourtable  
          where foo > X.foo ) < 10

Let's see if we can adapt the generic SQL to your example. We'll add a couple more rows for demonstration purposes:

create table repsales
( repno varchar(7)
, sales integer
insert into repsales values ( 'Rep13',  98500 ) 
insert into repsales values ( 'Rep28',  97120 ) 
insert into repsales values ( 'Rep11', 100000 ) 
insert into repsales values ( 'Rep41',  97120 ) 
insert into repsales values ( 'Rep9',   90937 )
insert into repsales values ( 'Rep37',  91937 )

Now let's try the generic query:

select repno
     , sales
  from repsales X 
 where ( select count(*) 
           from repsales  
          where sales > X.sales ) < 3
    by sales desc 


repno   sales       
------- ----------- 
Rep11   100000
Rep13   98500
Rep28   97120
Rep41   97120

Ties are included, as expected.

By the way, you probably already know this, but you cannot use subqueries in MySQL versions before 4.1. (Version 3.23 is positively ancient.) So you're stuck using LIMIT.

However, your "workaround" to run a test query to look at the third highest number is actually an excellent strategy. You may just need to do it a few times. An example of this is given in TOP n WITH TIES in MySQL (3 June 2004).

Dig Deeper on Oracle and SQL

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.