Home > Ask the Oracle Database / Applications Experts > SQL Questions & Answers > Summing quantities in gapless sequences
Ask The Oracle Expert: Questions & Answers
EMAIL THIS

Summing quantities in gapless sequences

Rudy Limeback EXPERT RESPONSE FROM: Rudy Limeback

Pose a Question
Other Oracle Categories
Meet all Oracle Experts
Become an Expert for this site


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   


>
QUESTION POSED ON: 24 November 2007

At an interview for a data warehousing position, they asked me to write a query to get the below result from given dataset:

DATA SET:

SMID  CSID  PURDATE  PURQTY
----  ----  -------  ------
1      1     200501    10
1      1     200502    12
1      1     200503    9

1      1     200507    10
1      1     200508    8

1      2     200505    10
1      2     200506    15

RESULT OF QUERY SHOULD BE:

SMID  CSID  STARTDT  ENDDATE  QTY
----  ----  -------  -------  ----
1      1    200501   200503    31
1      1    200507   200508    18
1      2    200505   200506    25

Unfortunately I could not figure out the expected answer. Please, can you take a look at it?


>

Oh, that's tricky. That's a pretty tough problem to throw at somebody in an interview.

Obviously what they were after was an analysis involving gap-less sequences. There are two sequences for SMID=1 CSID=1, because of the gap between 200503 and 200507.

First, let's find the sequences. This is accomplished by looking for values that occur just preceding and just following a possible sequence. If there are none, then we have a sequence, although it may have gaps:

select r1.SMID
     , r1.CSID
     , r1.PURDATE     as STARTDT
     , r2.PURDATE     as ENDDATE
     , ( select count(*) 
           from purchases
          where SMID = r1.SMID 
            and CSID = r1.CSID
            and PURDATE 
                between r1.PURDATE 
                    and r2.PURDATE ) as seq_count
     , r2.PURDATE - r1.PURDATE  + 1  as seq_diff
  from purchases as r1
inner
  join purchases as r2
    on r2.SMID = r1.SMID
   and r2.CSID = r1.CSID
   and r2.PURDATE > r1.PURDATE
   and not exists
       ( select 1 
           from purchases
          where SMID = r1.SMID
            and CSID = r1.CSID
            and PURDATE IN
                 ( r1.PURDATE - 1
                 , r2.PURDATE + 1 ) )

The query joins the table to itself based on SMID and CSID, such that the r2 PURDATE value is greater than the r1 value. (Yes, you are allowed to write an INNER JOIN that does not use equality as the join condition.) The NOT EXISTS subquery stipulates that the preceding or following value for the same SMID and CSID must be missing. Thus r1 and r2 are the endpoints of a sequence.

This query produces the following results:

SMID CSID STARTDT ENDDATE seq_count seq_diff
---- ---- ------- ------- --------- --------
 1    1    200501  200503    3         3
 1    1    200501  200508    5         8
 1    1    200507  200508    2         2
 1    2    200505  200506    2         2

Check the STARTDT and ENDDATE values of each result row to verify that the NOT EXISTS condition has been satisfied.

Notice that the count of the number of values in the sequence has been calculated, as well as the difference between first and last value. You can see immediately that the result rows we are interested in are the ones where these calculations are equal, which means that there are no internal gaps. The range 200501-200508 will be dropped because the difference is 8 but the count is only 5, which means there is a gap.

So let's move those calculations to the WHERE clause, and then use the filtered result set, which now contains only gap-free sequences, as a derived table in a join back to the main data table, with GROUP BY to get the sum of the quantities.

select gapfree.SMID
     , gapfree.CSID
     , gapfree.STARTDT
     , gapfree.ENDDATE
     , sum(data.PURQTY) as QTY
  from (
       select r1.SMID
            , r1.CSID
            , r1.PURDATE     as STARTDT
            , r2.PURDATE     as ENDDATE
         from purchases as r1
       inner
         join purchases as r2
           on r2.SMID = r1.SMID
          and r2.CSID = r1.CSID
          and r2.PURDATE > r1.PURDATE
          and not exists
              ( select 1 
                  from purchases
                 where SMID = r1.SMID
                   and CSID = r1.CSID
                   and PURDATE IN
                        ( r1.PURDATE - 1
                        , r2.PURDATE + 1 ) )
          and ( select count(*) 
                  from purchases
                 where SMID = r1.SMID 
                   and CSID = r1.CSID
               and PURDATE 
                       between r1.PURDATE 
                           and r2.PURDATE ) 
            = r2.PURDATE - r1.PURDATE  + 1 
       ) as gapfree
inner
  join purchases as data
    on data.SMID = gapfree.SMID
   and data.CSID = gapfree.CSID
   and data.PURDATE 
       between gapfree.STARTDT
           and gapfree.ENDDATE
group
    by gapfree.SMID
     , gapfree.CSID
     , gapfree.STARTDT
     , gapfree.ENDDATE

Seems a lot to expect of someone in an interview. Are you sure this wasn't a homework question? <grin>

Does anyone have a solution involving analytic SQL?


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   



RELATED CONTENT
SQL
How to check SQL query construction with the Mimer Validator
Using the SQL GROUP BY clause for counting combinations
How to use an SQL CASE expression
How to sort an SQL UNION query with special ORDER BY sequence
How to use string functions to make an SQL join
An SQL solution for a customer order homework problem
How to use SQL's POSITION function with substrings
Using SQL date functions to get totals for last three days
Using CASE in the SQL ORDER BY clause
What's the difference between an SQL inner join and equijoin?

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary



Search and Browse the Expert Answer Center
Search and browse more than 25,000 question and answer pairs from more than 250 TechTarget industry experts.
Browse our Expert Advice



Oracle White Papers: Fusion Middleware
HomeNewsTopicsTipsAsk the ExpertsMultimediaWhite PapersProductsBlogs
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2003 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts