Q
Problem solve Get help with specific problems with your technologies, process and projects.

Top 3 counts, revisited

In "Top 3 counts querying two tables" (27 June 2005), there is a potential trap.

In Top 3 counts querying two tables (27 June 2005), there is a potential trap. Actually it can exist in any "top"...

type report. Let's say that you are going to pay commission bonuses to the top three sales performers this month out of 100 reps. If a query is written just to get the top three, you might find that Rep11 = $100,000, Rep13 = $98,500, Rep28 = $97,120 and Rep41 = $97,120. All of the other reps are below $97,000. However, just retrieving the top three will probably miss the fourth (equally deserving) rep. I have run into this in more than one "top rank" type report. It might be wise to check that no TIES exist. I believe that Oracle has a ties function that allows for just this circumstance. In MySQL (version 3.23.xx) I usually run a minimal test query with the limit, i.e. LIMIT 3, capture the lowest value, and then run the actual query using HAVING as the restriction. You have been very helpful to me in the past and I really appreciate your insights.

You are right to be concerned about ties. However, the problem you mention happens only if you're using (one version of) TOP, or LIMIT. The queries given in my previous answer actually do take ties into consideration.

As far as using TOP or LIMIT is concerned, I'm not sure about Oracle, but in Microsoft Access, ties are automatically included in TOP queries, and in Microsoft SQL Server, you can say TOP n WITH TIES. In MySQL, LIMIT works strictly on rows, not on values. But all of this is proprietary syntax, and standard SQL would be much preferable.

More specifically, let's have a look at the generic top 10 query which was given in FIRST N rows, TOP N rows, LAST N rows, BOTTOM N rows...:

select foo
     , bar
  from yourtable X 
 where ( select count(*) 
           from yourtable  
          where foo > X.foo ) < 10

Let's see if we can adapt the generic SQL to your example. We'll add a couple more rows for demonstration purposes:

create table repsales
( repno varchar(7)
, sales integer
)
insert into repsales values ( 'Rep13',  98500 ) 
insert into repsales values ( 'Rep28',  97120 ) 
insert into repsales values ( 'Rep11', 100000 ) 
insert into repsales values ( 'Rep41',  97120 ) 
insert into repsales values ( 'Rep9',   90937 )
insert into repsales values ( 'Rep37',  91937 )

Now let's try the generic query:

select repno
     , sales
  from repsales X 
 where ( select count(*) 
           from repsales  
          where sales > X.sales ) < 3
order
    by sales desc 

Results:

repno   sales       
------- ----------- 
Rep11   100000
Rep13   98500
Rep28   97120
Rep41   97120

Ties are included, as expected.

By the way, you probably already know this, but you cannot use subqueries in MySQL versions before 4.1. (Version 3.23 is positively ancient.) So you're stuck using LIMIT.

However, your "workaround" to run a test query to look at the third highest number is actually an excellent strategy. You may just need to do it a few times. An example of this is given in TOP n WITH TIES in MySQL (3 June 2004).

This was last published in September 2005

Dig Deeper on Oracle and SQL

PRO+

Content

Find more PRO+ content and other member only offers, here.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Please create a username to comment.

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close