In Top 3 counts querying two tables (27 June 2005), there is a potential trap. Actually it can exist in any "top" type report. Let's say that you are going to pay commission bonuses to the top three sales performers this month out of 100 reps. If a query is written just to get the top three, you might find that Rep11 = $100,000, Rep13 = $98,500, Rep28 = $97,120 and Rep41 = $97,120. All of the other reps are below $97,000. However, just retrieving the top three will probably miss the fourth (equally deserving) rep. I have run into this in more than one "top rank" type report. It might be wise to check that no TIES exist. I believe that Oracle has a ties function that allows for just this circumstance. In MySQL (version 3.23.xx) I usually run a minimal test query with the limit, i.e. LIMIT 3, capture the lowest value, and then run the actual query using HAVING as the restriction. You have been very helpful to me in the past and I really appreciate your insights.
Requires Free Membership to View
You are right to be concerned about ties. However, the problem you mention happens only if you're using (one version of) TOP, or LIMIT. The queries given in my previous answer actually do take ties into consideration.
As far as using TOP or LIMIT is concerned, I'm not sure about Oracle, but in Microsoft Access, ties are automatically included in TOP queries, and in Microsoft SQL Server, you can say TOP n WITH TIES. In MySQL, LIMIT works strictly on rows, not on values. But all of this is proprietary syntax, and standard SQL would be much preferable.
More specifically, let's have a look at the generic top 10 query which was given in FIRST N rows, TOP N rows, LAST N rows, BOTTOM N rows...:
select foo
, bar
from yourtable X
where ( select count(*)
from yourtable
where foo > X.foo ) < 10
Let's see if we can adapt the generic SQL to your example. We'll add a couple more rows for demonstration purposes:
create table repsales ( repno varchar(7) , sales integer ) insert into repsales values ( 'Rep13', 98500 ) insert into repsales values ( 'Rep28', 97120 ) insert into repsales values ( 'Rep11', 100000 ) insert into repsales values ( 'Rep41', 97120 ) insert into repsales values ( 'Rep9', 90937 ) insert into repsales values ( 'Rep37', 91937 )
Now let's try the generic query:
select repno
, sales
from repsales X
where ( select count(*)
from repsales
where sales > X.sales ) < 3
order
by sales desc
Results:
repno sales ------- ----------- Rep11 100000 Rep13 98500 Rep28 97120 Rep41 97120
Ties are included, as expected.
By the way, you probably already know this, but you cannot use subqueries in MySQL versions before 4.1. (Version 3.23 is positively ancient.) So you're stuck using LIMIT.
However, your "workaround" to run a test query to look at the third highest number is actually an excellent strategy. You may just need to do it a few times. An example of this is given in TOP n WITH TIES in MySQL (3 June 2004).
This was first published in September 2005
Join the conversationComment
Share
Comments
Results
Contribute to the conversation