
Summing quantities in gapless sequences

Here's a tough one for our SQL expert: how to sum quantities in gapless sequences?

At an interview for a data warehousing position, they asked me to write a query to get the result below from the given dataset:

```
DATA SET:

SMID  CSID  PURDATE  PURQTY
----  ----  -------  ------
1     1     200501   10
1     1     200502   12
1     1     200503   9

1     1     200507   10
1     1     200508   8

1     2     200505   10
1     2     200506   15

RESULT OF QUERY SHOULD BE:

SMID  CSID  STARTDT  ENDDATE  QTY
----  ----  -------  -------  ---
1     1     200501   200503   31
1     1     200507   200508   18
1     2     200505   200506   25
```

Unfortunately I could not figure out the expected answer. Please, can you take a look at it?

Oh, that's tricky. That's a pretty tough problem to throw at somebody in an interview.

Obviously what they were after was an analysis involving gapless sequences. There are two sequences for SMID=1, CSID=1, because of the gap between 200503 and 200507.
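To see those sequence boundaries directly, here is a minimal sketch that loads the sample rows into an in-memory SQLite database from Python and lists every row that starts a sequence, meaning the same SMID and CSID have no row for PURDATE - 1. The use of SQLite and Python, the integer column types and the alias names `p` and `prev` are assumptions made for illustration; the table name `purchases` is the one used in the queries in this answer.

```python
import sqlite3

# Sample rows from the question: (SMID, CSID, PURDATE, PURQTY).
ROWS = [
    (1, 1, 200501, 10), (1, 1, 200502, 12), (1, 1, 200503, 9),
    (1, 1, 200507, 10), (1, 1, 200508, 8),
    (1, 2, 200505, 10), (1, 2, 200506, 15),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question does not state them.
conn.execute(
    "create table purchases "
    "(SMID integer, CSID integer, PURDATE integer, PURQTY integer)"
)
conn.executemany("insert into purchases values (?, ?, ?, ?)", ROWS)

# A row starts a sequence when no row exists for the previous PURDATE
# within the same SMID and CSID.
sequence_starts = conn.execute(
    """
    select p.SMID, p.CSID, p.PURDATE
      from purchases as p
     where not exists
           ( select 1
               from purchases as prev
              where prev.SMID = p.SMID
                and prev.CSID = p.CSID
                and prev.PURDATE = p.PURDATE - 1 )
     order by p.SMID, p.CSID, p.PURDATE
    """
).fetchall()

for row in sequence_starts:
    print(row)
```

Three rows come back: 200501 and 200507 for CSID=1, and 200505 for CSID=2, one per sequence in the expected result.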

First, let's find the candidate sequences. A candidate is a pair of rows for the same SMID and CSID where no row exists for the value just before the first one or just after the second one. Such a pair marks the two ends of a sequence, although the sequence may still have gaps inside it:

```
select r1.SMID
     , r1.CSID
     , r1.PURDATE     as STARTDT
     , r2.PURDATE     as ENDDATE
       -- number of rows actually present between the two endpoints
     , ( select count(*)
           from purchases
          where SMID = r1.SMID
            and CSID = r1.CSID
            and PURDATE
                between r1.PURDATE
                    and r2.PURDATE ) as seq_count
       -- number of values the range would hold if it had no gaps
       -- (if PURDATE is a YYYYMM number, this holds only within one year)
     , r2.PURDATE - r1.PURDATE  + 1  as seq_diff
  from purchases as r1
inner
  join purchases as r2
    on r2.SMID = r1.SMID
   and r2.CSID = r1.CSID
   and r2.PURDATE > r1.PURDATE
       -- nothing just before the start, nothing just after the end
   and not exists
       ( select 1
           from purchases
          where SMID = r1.SMID
            and CSID = r1.CSID
            and PURDATE IN
                ( r1.PURDATE - 1
                , r2.PURDATE + 1 ) )
```

The query joins the table to itself on SMID and CSID, keeping only pairs in which the r2 PURDATE value is greater than the r1 value. (Yes, you are allowed to write an INNER JOIN that does not use equality as the join condition.) The NOT EXISTS subquery stipulates that, for the same SMID and CSID, neither the value just before r1.PURDATE nor the value just after r2.PURDATE may be present. Thus r1 and r2 are the endpoints of a candidate sequence.

This query produces the following results:

```
SMID  CSID  STARTDT  ENDDATE  seq_count  seq_diff
----  ----  -------  -------  ---------  --------
1     1     200501   200503   3          3
1     1     200501   200508   5          8
1     1     200507   200508   2          2
1     2     200505   200506   2          2
```

Check the STARTDT and ENDDATE values of each result row to verify that the NOT EXISTS condition has been satisfied.

Notice that the count of the number of values in the sequence has been calculated, as well as the difference between first and last value. You can see immediately that the result rows we are interested in are the ones where these calculations are equal, which means that there are no internal gaps. The range 200501-200508 will be dropped because the difference is 8 but the count is only 5, which means there is a gap.
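The comparison of count against difference can be checked in isolation. The sketch below redoes the arithmetic in plain Python for the three SMID=1, CSID=1 candidates from the result above; the function names `seq_count`, `seq_diff` and `is_gapless` are made up for this illustration and simply mirror the two column aliases in the query.

```python
# PURDATE values for SMID=1, CSID=1, copied from the sample data.
purdates = [200501, 200502, 200503, 200507, 200508]


def seq_count(values, start, end):
    """Number of values actually present in the closed range [start, end]."""
    return sum(1 for v in values if start <= v <= end)


def seq_diff(start, end):
    """Number of values the range would contain if nothing were missing."""
    return end - start + 1


def is_gapless(values, start, end):
    """A range is gap-free exactly when the two numbers agree."""
    return seq_count(values, start, end) == seq_diff(start, end)


# The three candidate ranges the query returns for SMID=1, CSID=1.
candidates = [(200501, 200503), (200501, 200508), (200507, 200508)]
verdicts = [is_gapless(purdates, start, end) for start, end in candidates]

for (start, end), ok in zip(candidates, verdicts):
    print(start, end, seq_count(purdates, start, end), seq_diff(start, end), ok)
```

Only the middle candidate, 200501 to 200508, fails the test (5 values present, 8 expected), which matches the row that gets dropped.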

So let's move those calculations to the WHERE clause, and then use the filtered result set, which now contains only gap-free sequences, as a derived table in a join back to the main data table, with GROUP BY to get the sum of the quantities.

```
select gapfree.SMID
     , gapfree.CSID
     , gapfree.STARTDT
     , gapfree.ENDDATE
     , sum(data.PURQTY) as QTY
  from (
       -- derived table: endpoints of gap-free sequences only
       select r1.SMID
            , r1.CSID
            , r1.PURDATE     as STARTDT
            , r2.PURDATE     as ENDDATE
         from purchases as r1
       inner
         join purchases as r2
           on r2.SMID = r1.SMID
          and r2.CSID = r1.CSID
          and r2.PURDATE > r1.PURDATE
              -- nothing just before the start, nothing just after the end
          and not exists
              ( select 1
                  from purchases
                 where SMID = r1.SMID
                   and CSID = r1.CSID
                   and PURDATE IN
                       ( r1.PURDATE - 1
                       , r2.PURDATE + 1 ) )
              -- rows present must equal the length of the range
          and ( select count(*)
                  from purchases
                 where SMID = r1.SMID
                   and CSID = r1.CSID
                   and PURDATE
                       between r1.PURDATE
                           and r2.PURDATE )
            = r2.PURDATE - r1.PURDATE  + 1
       ) as gapfree
inner
  join purchases as data
    on data.SMID = gapfree.SMID
   and data.CSID = gapfree.CSID
   and data.PURDATE
       between gapfree.STARTDT
           and gapfree.ENDDATE
group
    by gapfree.SMID
     , gapfree.CSID
     , gapfree.STARTDT
     , gapfree.ENDDATE
```
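One way to confirm that the query returns the result the interviewer asked for is to run it against the sample rows. The sketch below does that with Python's `sqlite3` module and an in-memory database. SQLite, the integer column types and the added ORDER BY (there only to make the output order predictable) are assumptions for this illustration; the query is otherwise the one above.

```python
import sqlite3

# Sample rows from the question: (SMID, CSID, PURDATE, PURQTY).
ROWS = [
    (1, 1, 200501, 10), (1, 1, 200502, 12), (1, 1, 200503, 9),
    (1, 1, 200507, 10), (1, 1, 200508, 8),
    (1, 2, 200505, 10), (1, 2, 200506, 15),
]

FINAL_QUERY = """
select gapfree.SMID, gapfree.CSID, gapfree.STARTDT, gapfree.ENDDATE
     , sum(data.PURQTY) as QTY
  from ( select r1.SMID, r1.CSID
              , r1.PURDATE as STARTDT
              , r2.PURDATE as ENDDATE
           from purchases as r1
          inner join purchases as r2
             on r2.SMID = r1.SMID
            and r2.CSID = r1.CSID
            and r2.PURDATE > r1.PURDATE
            and not exists
                ( select 1 from purchases
                   where SMID = r1.SMID
                     and CSID = r1.CSID
                     and PURDATE in ( r1.PURDATE - 1, r2.PURDATE + 1 ) )
            and ( select count(*) from purchases
                   where SMID = r1.SMID
                     and CSID = r1.CSID
                     and PURDATE between r1.PURDATE and r2.PURDATE )
              = r2.PURDATE - r1.PURDATE + 1
       ) as gapfree
 inner join purchases as data
    on data.SMID = gapfree.SMID
   and data.CSID = gapfree.CSID
   and data.PURDATE between gapfree.STARTDT and gapfree.ENDDATE
 group by gapfree.SMID, gapfree.CSID, gapfree.STARTDT, gapfree.ENDDATE
 order by gapfree.SMID, gapfree.CSID, gapfree.STARTDT  -- added for stable output
"""

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question does not state them.
conn.execute(
    "create table purchases "
    "(SMID integer, CSID integer, PURDATE integer, PURQTY integer)"
)
conn.executemany("insert into purchases values (?, ?, ?, ?)", ROWS)

result = conn.execute(FINAL_QUERY).fetchall()
for row in result:
    print(row)
```

The three rows printed carry the quantities 31, 18 and 25, the same as the expected result in the question.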

Seems a lot to expect of someone in an interview. Are you sure this wasn't a homework question? <grin>

Does anyone have a solution involving analytic SQL?
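As one possible answer to that question, here is a sketch using the ROW_NUMBER analytic function. It is not the method used above, and it rests on some assumptions: PURDATE is an integer in YYYYMM form, there is at most one row per SMID, CSID and PURDATE, and the database supports window functions (SQLite 3.25 or later in this demo, run from Python). Within a gapless run, the month number and the row number both rise by one per row, so their difference stays constant and can serve as a grouping key. The names `numbered` and `grp` are made up for the example.

```python
import sqlite3

# Sample rows from the question: (SMID, CSID, PURDATE, PURQTY).
ROWS = [
    (1, 1, 200501, 10), (1, 1, 200502, 12), (1, 1, 200503, 9),
    (1, 1, 200507, 10), (1, 1, 200508, 8),
    (1, 2, 200505, 10), (1, 2, 200506, 15),
]

ANALYTIC_QUERY = """
with numbered as (
    select SMID, CSID, PURDATE, PURQTY
           -- YYYYMM converted to a running month number, so that December
           -- and the following January count as consecutive (assumption:
           -- PURDATE is an integer, making "/" an integer division here)
         , (PURDATE / 100) * 12 + (PURDATE % 100)
           - row_number() over ( partition by SMID, CSID
                                 order by PURDATE ) as grp
      from purchases
)
select SMID, CSID
     , min(PURDATE) as STARTDT
     , max(PURDATE) as ENDDATE
     , sum(PURQTY)  as QTY
  from numbered
 group by SMID, CSID, grp
 order by SMID, CSID, STARTDT
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table purchases "
    "(SMID integer, CSID integer, PURDATE integer, PURQTY integer)"
)
conn.executemany("insert into purchases values (?, ?, ?, ?)", ROWS)

islands = conn.execute(ANALYTIC_QUERY).fetchall()
for row in islands:
    print(row)
```

On the sample data this gives the same three rows as the self-join query. Two differences are worth knowing about: a single isolated month comes back as its own one-row sequence, which the self-join version excludes because it requires r2.PURDATE > r1.PURDATE, and the month-number conversion keeps a run intact across a year boundary, a case the sample data does not exercise.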

This was last published in November 2007
