# Counting rows in runs

I have a table with fields MarketID and XY, where MarketID is a unique character field and, for this example, XY can only contain "X" or "Y". My example data is this:

```
MarketID  XY
--------  --
1         X
2         X
3         X
4         Y
5         Y
6         X
7         X
```

What I need to do is count the occurrences of field XY, but in MarketID order, to end up with a result something like this:

```
XY    Count
--    -----
X       3
Y       2
X       2
```

These are called "runs" or "sequences" and the SQL can sure be tricky.

In this table, the MarketID column provides the inherent ordering of the data. Imagine the IDs as points on a line:

`------o------o------o------o------`

Now we will self-join this table:

```
  from Market_XY as t1
inner
  join Market_XY as t3
    on t3.MarketID > t1.MarketID
   and t3.XY = t1.XY
```

We will pair up each row (t1) with all other rows (t3) that have a higher ID as well as the same XY. The same XY means these two rows are potentially the two ends of a run:

```
------o------o------o------o------
      |      |      |      |
      t1     t3---->t3---->t3---->
```
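To see what this first self-join produces, here is a minimal sketch that loads the sample rows into an in-memory SQLite database and lists the t1-t3 pairs. It assumes MarketID is stored as an integer (the question describes a character field) and uses Python's standard `sqlite3` module only as a convenient way to run the SQL; `candidate_pairs` is a name invented for this illustration.

```python
import sqlite3

# Sample rows from the question. MarketID is an integer here, which is an
# assumption: the question describes it as a character field.
ROWS = [(1, "X"), (2, "X"), (3, "X"), (4, "Y"), (5, "Y"), (6, "X"), (7, "X")]


def candidate_pairs(rows):
    """Return every (t1, t3) pair with a higher ID and the same XY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Market_XY (MarketID integer primary key, XY text)")
    conn.executemany("insert into Market_XY values (?, ?)", rows)
    sql = """
        select t1.MarketID
             , t3.MarketID
          from Market_XY as t1
        inner
          join Market_XY as t3
            on t3.MarketID > t1.MarketID
           and t3.XY = t1.XY
        order
            by t1.MarketID
             , t3.MarketID
    """
    pairs = conn.execute(sql).fetchall()
    conn.close()
    return pairs


pairs = candidate_pairs(ROWS)
print(pairs)
```

Note that pairs such as (1, 6) show up at this stage even though rows 4 and 5 interrupt them with a Y, which is why these are only potential runs.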

Next, we self-join once more, but with a LEFT OUTER JOIN, using an IS NULL condition in the WHERE clause to make sure no matches are found:

```
  from Market_XY as t1
inner
  join Market_XY as t3
    on t3.MarketID > t1.MarketID
   and t3.XY = t1.XY
left outer
  join Market_XY as t2
    on t2.MarketID between t1.MarketID and t3.MarketID
   and t2.XY <> t3.XY
 where t2.MarketID is null
```

Essentially, we want all t1-t3 runs, where there is no t2 between them with a different XY. The LEFT OUTER JOIN looks for them, but the IS NULL keeps only the t1-t3 pairs where no t2 exists.
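As a rough check of the LEFT OUTER JOIN with IS NULL step, the sketch below runs that FROM clause against the sample rows and lists the t1-t3 pairs that survive. It again assumes an integer MarketID and an in-memory SQLite database; `unbroken_pairs` is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question, with MarketID as an integer (an assumption).
ROWS = [(1, "X"), (2, "X"), (3, "X"), (4, "Y"), (5, "Y"), (6, "X"), (7, "X")]


def unbroken_pairs(rows):
    """Return the (t1, t3) pairs with no differing XY between them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Market_XY (MarketID integer primary key, XY text)")
    conn.executemany("insert into Market_XY values (?, ?)", rows)
    sql = """
        select t1.MarketID
             , t3.MarketID
          from Market_XY as t1
        inner
          join Market_XY as t3
            on t3.MarketID > t1.MarketID
           and t3.XY = t1.XY
        left outer
          join Market_XY as t2
            on t2.MarketID between t1.MarketID and t3.MarketID
           and t2.XY <> t3.XY
         where t2.MarketID is null
        order
            by t1.MarketID
             , t3.MarketID
    """
    kept = conn.execute(sql).fetchall()
    conn.close()
    return kept


kept = unbroken_pairs(ROWS)
print(kept)  # [(1, 2), (1, 3), (2, 3), (4, 5), (6, 7)]
```

Pairs that straddle a row with a different XY, such as (1, 6) or (3, 7), find at least one t2 and are dropped by the IS NULL test.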

But obviously there will be overlapping runs. In the given sample data, notice that the run (1,2,3) of Xs has the runs (1,2) and (2,3) inside it. What remains now is simply to take the longest runs. This involves careful grouping and the use of both MIN() and MAX().

```
select XY
     , hiID - min(loID) + 1 as Count
  from (
       select t1.MarketID as loID
            , t1.XY
            , max(t3.MarketID) as hiID
         from Market_XY as t1
       inner
         join Market_XY as t3
           on t3.MarketID > t1.MarketID
          and t3.XY = t1.XY
       left outer
         join Market_XY as t2
           on t2.MarketID
                between t1.MarketID
                    and t3.MarketID
          and t2.XY <> t3.XY
        where t2.MarketID is null
       group
           by t1.MarketID
            , t1.XY
       ) as hiIDs
group
    by XY
     , hiID
order
    by hiID
```
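The query can be checked against the sample data with a short sketch like the one below. It assumes an integer MarketID and an in-memory SQLite database, and it renames the output column to `run_length` purely for this illustration. It prints the derived table first, then the final result.

```python
import sqlite3

# Sample rows from the question, with MarketID as an integer (an assumption).
ROWS = [(1, "X"), (2, "X"), (3, "X"), (4, "Y"), (5, "Y"), (6, "X"), (7, "X")]

# The derived table: for each starting row, the highest ID it reaches.
INNER_SQL = """
    select t1.MarketID as loID
         , t1.XY
         , max(t3.MarketID) as hiID
      from Market_XY as t1
    inner
      join Market_XY as t3
        on t3.MarketID > t1.MarketID
       and t3.XY = t1.XY
    left outer
      join Market_XY as t2
        on t2.MarketID between t1.MarketID and t3.MarketID
       and t2.XY <> t3.XY
     where t2.MarketID is null
    group
        by t1.MarketID
         , t1.XY
"""

# The outer query: one row per run end, counting back to the lowest start.
FULL_SQL = f"""
    select XY
         , hiID - min(loID) + 1 as run_length
      from ({INNER_SQL}) as hiIDs
    group
        by XY
         , hiID
    order
        by hiID
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table Market_XY (MarketID integer primary key, XY text)")
conn.executemany("insert into Market_XY values (?, ?)", ROWS)

derived = sorted(conn.execute(INNER_SQL).fetchall())
runs = conn.execute(FULL_SQL).fetchall()
conn.close()

print(derived)  # [(1, 'X', 3), (2, 'X', 3), (4, 'Y', 5), (6, 'X', 7)]
print(runs)     # [('X', 3), ('Y', 2), ('X', 2)]
```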

It looks tricky but it really isn't. To understand it, run the subquery alone, without the MAX() or GROUP BY, but showing the t3 columns. Then see what happens with the MAX() and GROUP BY. Then apply the outer query with its GROUP BY and MIN(). Note that MAX - MIN + 1 gives the count for the run, provided the MarketID values are consecutive numbers with no gaps.
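One thing worth knowing about the join condition: because t3 must have a strictly higher ID than t1, a run consisting of a single row never finds a partner and does not appear in the result. The sample data has no such run, so it does not matter there. A possible variation, offered only as a sketch and not as part of the original answer, is to let a row pair with itself (`>=` instead of `>`) and to count the rows per run with COUNT(*) rather than subtracting IDs. The helper name `run_lengths` and the second data set are made up for this illustration, and MarketID is again assumed to be an integer.

```python
import sqlite3

# Variation on the query: t3 may equal t1, and runs are counted with count(*).
VARIANT_SQL = """
    select XY
         , count(*) as run_length
      from (
           select t1.MarketID as loID
                , t1.XY
                , max(t3.MarketID) as hiID
             from Market_XY as t1
           inner
             join Market_XY as t3
               on t3.MarketID >= t1.MarketID
              and t3.XY = t1.XY
           left outer
             join Market_XY as t2
               on t2.MarketID between t1.MarketID and t3.MarketID
              and t2.XY <> t3.XY
            where t2.MarketID is null
           group
               by t1.MarketID
                , t1.XY
           ) as hiIDs
    group
        by XY
         , hiID
    order
        by hiID
"""


def run_lengths(rows):
    """Load rows into an in-memory table and return (XY, run_length) in ID order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Market_XY (MarketID integer primary key, XY text)")
    conn.executemany("insert into Market_XY values (?, ?)", rows)
    result = conn.execute(VARIANT_SQL).fetchall()
    conn.close()
    return result


sample = [(1, "X"), (2, "X"), (3, "X"), (4, "Y"), (5, "Y"), (6, "X"), (7, "X")]
gappy = [(1, "X"), (2, "X"), (5, "Y"), (9, "X"), (10, "X")]  # hypothetical data

print(run_lengths(sample))  # [('X', 3), ('Y', 2), ('X', 2)]
print(run_lengths(gappy))   # [('X', 2), ('Y', 1), ('X', 2)]
```

With `>=`, every row appears exactly once as t1, so counting the rows that share the same run end gives the run length even when the IDs have gaps.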

P.S. If anyone has an analytic SQL solution for this, please do send it in. I would love to see it.
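For what it is worth, one commonly used analytic approach is the difference of two ROW_NUMBER() sequences: one numbered over all rows, one numbered within each XY value. Inside a run the difference stays constant, so it can serve as a grouping key. The sketch below is one possible formulation, not a reader submission. It assumes an integer MarketID and a SQLite build with window function support (version 3.25 or later); the names `grp` and `run_length` are invented here.

```python
import sqlite3

# Sample rows from the question, with MarketID as an integer (an assumption).
ROWS = [(1, "X"), (2, "X"), (3, "X"), (4, "Y"), (5, "Y"), (6, "X"), (7, "X")]

# Rows in the same run share the same difference between the two numberings.
ANALYTIC_SQL = """
    select XY
         , count(*) as run_length
      from (
           select MarketID
                , XY
                , row_number() over (order by MarketID)
                  - row_number() over (partition by XY order by MarketID) as grp
             from Market_XY
           ) as numbered
    group
        by XY
         , grp
    order
        by min(MarketID)
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table Market_XY (MarketID integer primary key, XY text)")
conn.executemany("insert into Market_XY values (?, ?)", ROWS)
analytic_runs = conn.execute(ANALYTIC_SQL).fetchall()
conn.close()

print(analytic_runs)  # [('X', 3), ('Y', 2), ('X', 2)]
```

This version reads the table once instead of joining it to itself twice, and it does not depend on the IDs being consecutive.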

This was last published in March 2007
