Selecting random rows

Hi Rudy.
Is

    Requires Free Membership to View

    By submitting your registration information to SearchOracle.com you agree to receive email communications from TechTarget and TechTarget partners. We encourage you to read our Privacy Policy which contains important disclosures about how we collect and use your registration and other information. If you reside outside of the United States, by submitting this registration information you consent to having your personal data transferred to and processed in the United States. Your use of SearchOracle.com is governed by our Terms of Use. You may contact us at webmaster@TechTarget.com.

there a way to get a random (and not sequential) of X rows (with all data) in one query (and not X identical queries)?


Short answer: No.

Long answer: Yes, sort of, but you might not like it.

There is no way in standard SQL to select rows randomly. Some sort of programming is required.

Let's say you have a table with many rows and the primary key is an integer, like an autonumber or identity key or sequence number. If you can generate random keys from the range of primary keys in use, these may be used to select rows.

First, you need a "driver" table which will hold the generated keys --

create table Driver
  ( driverID autonumber
  , randno integer)

Next, find the range of primary keys in use --

select min(primaryID) as MinID
     , max(primaryID) as MaxID
  from yourTable

Now, generate some random keys using the min and max ID values as the range bounds (note: the following is pseudocode, not any specific programming language) and insert them into the driver table --

for 1 <= j <= 100
  r = random(minID, maxID)
  insert into Driver(randno) values (r)
end

Finally, run your select using the driver table of random keys --

select yourTable.*
  from Driver, yourTable
 where yourTable.ID =
     (select min(primaryID) from yourTable
       where primaryID >= Driver.randno)

Ugly, to say the least. And only pseudorandom. And not tested lately. Note the subquery is required in case there are gaps in the actual primary keys in yourTable.

Another way is to arbitrarily take a substring out of a string conversion of some convenient column in yourTable like a timestamp --

select TOP 100 yourtable.*
     , substring(cast(yourTimestamp as char(20))
                 from 14 to 18) as foo
  from yourTable
 order by foo desc

But that's even less random.

The best solution, for situations where you only need a small number of rows, such as in testing, is to have a sample table which contains not random rows of live data, but test cases -- in other words, a place where you can collect those rows which are of interest or which represent conditions you want to test for.

For More Information


This was first published in June 2001

Join the conversationComment

Share
Comments

    Results

    Contribute to the conversation

    All fields are required. Comments will appear at the bottom of the article.