Problem solve Get help with specific problems with your technologies, process and projects.

How and why to use CUME_DIST

How to use Oracle's cumulative distribution function.

On numerous occasions, I have been required to pull the top or the bottom 20% or 30% (or any other value) from...

the dataset. The Oracle 8i analytical function CUME_DIST (cumulative distribution) is a nice and easy way to do this.

A brief description

CUME_DIST is an analytic function. It computes the relative position of a specified value in a group of values. For a row R, assuming ascending ordering, the CUME_DIST of R is the number of rows with values lower than or equal to the value of R, divided by the number of rows being evaluated (the entire query result set or a partition). The range of values returned by CUME_DIST is >0 to <=1. Tie values always evaluate to the same cumulative distribution value.

How to use it

Suppose we have a table TEST with only two columns:

ORACLE_8.1.7>desc test; 
Name       Type 
------  ------- 
ID       NUMBER 
VALUE    NUMBER 

The table has the following data:

ORACLE_8.1.7>select * from test; 

ID       VALUE 
------  ------ 
1          180 
2          100 
3         1000 
4           10 
5            1 

The problem is to find the top 20% of the data (i.e. one -- the top -- record out of the five records in the TEST table). Here is the solution:

ORACLE_8.1.7> select id,value from
(select id, value,cume_dist() over (order by value desc) 
cd from test) where cd <= 0.20        

ID     VALUE 
----- ------ 
3       1000 

Note that the cd value can be changed to 30% or 40% or any other value.

For More Information


Dig Deeper on Oracle and SQL

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide.com

SearchDataCenter

SearchContentManagement

SearchHRSoftware

Close