Select only the second of duplicates

All I want to do is print the second tuple where the NO is repeated:

 NO  NAME
 1   Name1
 2   Name2
 2   Name3
 2   Name4
 3   Name5
 3   Name6
 4   Name7
 5   Name8
 5   Name9

When I run my SQL query, my output must be --

 NO  NAME
 2   Name3
 3   Name6
 5   Name9

What is the query?


The first thing we need to establish is what you mean by the "second" tuple. There must be a sequence, a way of ordering the rows, that allows us to determine which one is first, and which one is second. By your example, I shall assume that "first" means the lowest collating NAME.

Now for the problem of finding the "second" one. There may be other ways to do it, but here's how I would approach this problem.

The "first" one is the lowest name, and we can get this with the MIN() function and grouping on NO:

select NO, min(NAME)
  from yourtable
group
    by NO

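To try the queries in this answer, a small test table can be built from the rows shown in the question. This is only a sketch: the column types below are assumptions (the question does not give them), and yourtable is the same placeholder name used in the queries.

```sql
-- Hypothetical setup: the column types are assumed, not taken from the question.
create table yourtable
     ( NO    integer
     , NAME  varchar(20)
     );

-- The nine rows from the question, one INSERT each for portability.
insert into yourtable (NO, NAME) values (1, 'Name1');
insert into yourtable (NO, NAME) values (2, 'Name2');
insert into yourtable (NO, NAME) values (2, 'Name3');
insert into yourtable (NO, NAME) values (2, 'Name4');
insert into yourtable (NO, NAME) values (3, 'Name5');
insert into yourtable (NO, NAME) values (3, 'Name6');
insert into yourtable (NO, NAME) values (4, 'Name7');
insert into yourtable (NO, NAME) values (5, 'Name8');
insert into yourtable (NO, NAME) values (5, 'Name9');

-- The "first" name per NO; with these rows it returns
-- (1, Name1), (2, Name2), (3, Name5), (4, Name7), (5, Name8).
select NO, min(NAME)
  from yourtable
group
    by NO;
```

Note that the grouped query returns a row for every NO, including 1 and 4, which occur only once.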
The "second" one is trickier: it is the lowest one that isn't the first one. This time, instead of using GROUP BY, we use another method to produce grouping, the correlated subquery.

select NO, NAME
  from yourtable XX
 where NAME =
       ( select min(NAME)
           from yourtable
          where NO = XX.NO
            and NAME >
                ( select min(NAME)
                    from yourtable
                   where NO = XX.NO
                )
       )

To see how this works, consider any row in the outer query, referred to through the correlation variable XX. The innermost subquery finds the lowest name among all rows with the same value of NO as the XX row. The middle subquery then finds the lowest name, again for the same NO, that is greater than the one found by the innermost subquery. In other words, the second lowest. Finally, the outer query keeps the XX row only if its name equals that second lowest name. When a NO occurs only once, no name is greater than the lowest, so the middle subquery returns NULL and the row is not selected; this is why NO 1 and NO 4 do not appear in the result.
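The two subqueries can also be run on their own to follow the reasoning for a single group. The sketch below substitutes the literal value 2 for XX.NO and assumes yourtable holds exactly the nine rows from the question.

```sql
-- Innermost subquery for NO = 2: the lowest name in the group.
-- With the question's rows this is Name2.
select min(NAME)
  from yourtable
 where NO = 2;

-- Middle subquery for NO = 2: the lowest name above that one.
-- With the question's rows this is Name3.
select min(NAME)
  from yourtable
 where NO = 2
   and NAME >
       ( select min(NAME)
           from yourtable
          where NO = 2
       );
```

The outer query then keeps only the row for NO = 2 whose NAME matches that second value.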

It's easy when you see it explained, but kind of hard to come up with on your own if you've never seen it before.
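As one of the other ways to do it, a database that supports analytic (window) functions could number the rows within each NO and keep the one numbered 2. This is a sketch of that alternative technique, not the method described above; the alias names RN and ranked are made up for illustration.

```sql
-- Alternative sketch: number the rows of each NO in NAME order,
-- then keep only the row numbered 2.
select NO, NAME
  from ( select NO, NAME
              , row_number() over (partition by NO order by NAME) as RN
           from yourtable
       ) ranked
 where RN = 2
```

The two approaches agree on the data in the question but can differ when names repeat within a NO: the correlated subquery looks for a name strictly greater than the lowest one, while row_number() simply takes whichever row is numbered second.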

This was first published in March 2003
