
# What does this theta self-join mean?

Could you please explain the DB2 query given below? It is a self-join, but the part I cannot understand is how it manages to generate sequential order numbers with the help of the self-join and the WHERE clause.

```
SELECT count(*) + 1000,
       A.SLS_TM_NBR,
       A.FFO_TYP_CD,
       A.FFO_NBR
FROM
WHERE
  ((A.SLS_TM_NBR > B.SLS_TM_NBR) OR
   (A.SLS_TM_NBR = B.SLS_TM_NBR AND
    A.FFO_TYP_CD > B.FFO_TYP_CD) OR
   (A.SLS_TM_NBR = B.SLS_TM_NBR AND
    A.FFO_TYP_CD = B.FFO_TYP_CD AND
    A.FFO_NBR >= B.FFO_NBR))
GROUP BY
       A.SLS_TM_NBR,
       A.FFO_TYP_CD,
       A.FFO_NBR
```

Don't let the complex join condition throw you. This self-join simply counts rows which are "less than" the row in question. The three OR branches compare rows column by column: first on SLS_TM_NBR, then on FFO_TYP_CD when the first values are equal, then on FFO_NBR when the first two are equal. This type of join is not an equi-join, where rows are matched that have equal values in some key. Rather, each row is joined to all the rows that satisfy the inequality. This is called a theta join even though few people use this terminology. (I just call it a "less than" join.) With each row joined to all its lesser rows (and, because of the final >=, to itself), the GROUP BY generates a COUNT(*) which simply indicates how many such rows each row has.
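To see the counting at work, here is a minimal sketch that runs the same join condition in SQLite through Python's sqlite3 module rather than in DB2. The table name ffo_demo and its four rows are made up for illustration, since the question does not show the real table.

```python
import sqlite3

# Hypothetical table and rows; the FROM clause of the original query is not shown.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ffo_demo (SLS_TM_NBR INTEGER, FFO_TYP_CD TEXT, FFO_NBR INTEGER)"
)
conn.executemany(
    "INSERT INTO ffo_demo VALUES (?, ?, ?)",
    [(20, "A", 5), (10, "B", 1), (10, "A", 7), (10, "A", 3)],
)

# Same WHERE and GROUP BY as the question; ORDER BY added only for display.
query = """
SELECT COUNT(*) + 1000 AS order_nbr,
       A.SLS_TM_NBR, A.FFO_TYP_CD, A.FFO_NBR
FROM ffo_demo AS A, ffo_demo AS B
WHERE (A.SLS_TM_NBR > B.SLS_TM_NBR)
   OR (A.SLS_TM_NBR = B.SLS_TM_NBR AND A.FFO_TYP_CD > B.FFO_TYP_CD)
   OR (A.SLS_TM_NBR = B.SLS_TM_NBR AND A.FFO_TYP_CD = B.FFO_TYP_CD
       AND A.FFO_NBR >= B.FFO_NBR)
GROUP BY A.SLS_TM_NBR, A.FFO_TYP_CD, A.FFO_NBR
ORDER BY order_nbr
"""
order_rows = conn.execute(query).fetchall()
for row in order_rows:
    print(row)
# (1001, 10, 'A', 3)
# (1002, 10, 'A', 7)
# (1003, 10, 'B', 1)
# (1004, 20, 'A', 5)
```

Each row is matched to itself plus every row that sorts before it on the three columns, so the counts run 1, 2, 3, 4 and the added 1000 turns them into 1001 through 1004.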

One example where this type of join is useful is in determining a ranking. Imagine a table like this --

```
theID  theValue
a       6
b       8
c      12
d       7
e      11
f      15
g      13
h      21
i       9
```

Running a simple query with an ORDER BY will get you the records in the right sequence, but using a theta self-join will generate rankings. Remember that the ranking is actually just the number of lesser rows --

```
select A.theValue
     , count(*) as theRank
  from theTable as A
     , theTable as B
 where A.theValue >= B.theValue
group by
       A.theValue

theValue  theRank
    6        1
    7        2
    8        3
    9        4
   11        5
   12        6
   13        7
   15        8
   21        9
```

Notice that the results are ranked from smallest to largest. This would be the case when ranking golf scores or demerit points, say. For other queries, where the top rank goes to the largest value, the theta condition would be reversed so that each row is joined to its greater rows instead of its lesser rows. Also, notice I had to include equality in the comparison (each row is joined to the rows less than or equal to it, A.theValue >= B.theValue) to generate a result row for every row in the A table. Otherwise, the lowest row is left out, as there are no rows lower than the lowest.
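Both points can be checked with a small sketch, again using SQLite through Python's sqlite3 module as a stand-in for DB2. The helper rank_with is a hypothetical name introduced only for this illustration; it runs the ranking query with whichever comparison operator it is given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (theID TEXT, theValue INTEGER)")
conn.executemany(
    "INSERT INTO theTable VALUES (?, ?)",
    [("a", 6), ("b", 8), ("c", 12), ("d", 7), ("e", 11),
     ("f", 15), ("g", 13), ("h", 21), ("i", 9)],
)

def rank_with(comparison):
    """Run the ranking self-join with the given comparison operator (demo helper)."""
    sql = f"""
        SELECT A.theValue, COUNT(*) AS theRank
        FROM theTable AS A, theTable AS B
        WHERE A.theValue {comparison} B.theValue
        GROUP BY A.theValue
        ORDER BY theRank
    """
    return conn.execute(sql).fetchall()

largest_first = rank_with("<=")  # reversed condition: rank 1 goes to the largest value
strict = rank_with(">")          # no equality: the lowest value has nothing to join to

print(largest_first[0], largest_first[-1])  # (21, 1) (6, 9)
print(len(strict), strict[0])               # 8 (7, 1)
```

With the strict comparison only eight of the nine values come back, and the value 6 is the one that disappears.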

Warning: the above query will work correctly only if there are no "ties." If you have ties in your data, the ranking query is substantially more complex.
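The sketch below shows what goes wrong, using a made-up four-row table in which the rows d and x tie on the value 7 (SQLite via Python's sqlite3 again stands in for DB2). It then tries one possible tie-tolerant formulation, a correlated subquery that counts strictly lesser rows and adds one. That second query is only an assumption about what "correct" ranking means here (tied rows share a rank and the next rank is skipped), not the only way to handle ties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (theID TEXT, theValue INTEGER)")
# Hypothetical data: rows d and x tie on the value 7.
conn.executemany(
    "INSERT INTO theTable VALUES (?, ?)",
    [("a", 6), ("d", 7), ("x", 7), ("b", 8)],
)

# The simple ranking query: the tied group is counted twice over.
naive = conn.execute("""
    SELECT A.theValue, COUNT(*) AS theRank
    FROM theTable AS A, theTable AS B
    WHERE A.theValue >= B.theValue
    GROUP BY A.theValue
    ORDER BY A.theValue
""").fetchall()
print(naive)      # [(6, 1), (7, 6), (8, 4)]

# One tie-tolerant sketch: rank = 1 + number of strictly lesser rows.
tie_aware = conn.execute("""
    SELECT A.theID, A.theValue,
           (SELECT COUNT(*) FROM theTable AS B
             WHERE B.theValue < A.theValue) + 1 AS theRank
    FROM theTable AS A
    ORDER BY theRank, A.theID
""").fetchall()
print(tie_aware)  # [('a', 6, 1), ('d', 7, 2), ('x', 7, 2), ('b', 8, 4)]
```

In the simple query the value 7 gets a "rank" of 6, because two A rows are each joined to three B rows and then collapsed into one group.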

So, given that the theta join conveniently produces a number, which may or may not be a ranking, but which in your example was further massaged by adding 1000 to it, is this a good way to generate order numbers? The answer is maybe. It is not a good method if there are ties, because then there will be more than one row with the same number of lesser rows, and hence the same order number. For your own comfort, double-check that there is a UNIQUE constraint covering the columns used in the WHERE clause (in your query, the combination of SLS_TM_NBR, FFO_TYP_CD and FFO_NBR).
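If you cannot inspect the constraints, a quick check of the data itself is to look for duplicate combinations of the three columns. This is a minimal sketch under the same assumptions as before: SQLite via Python's sqlite3 instead of DB2, and a hypothetical table ffo_demo with one deliberately duplicated row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ffo_demo (SLS_TM_NBR INTEGER, FFO_TYP_CD TEXT, FFO_NBR INTEGER)"
)
# Hypothetical rows; (10, 'A', 7) appears twice to simulate a tie.
conn.executemany(
    "INSERT INTO ffo_demo VALUES (?, ?, ?)",
    [(10, "A", 3), (10, "A", 7), (10, "A", 7), (20, "A", 5)],
)

# Any row returned here is a tie that would produce a repeated order number.
duplicates = conn.execute("""
    SELECT SLS_TM_NBR, FFO_TYP_CD, FFO_NBR, COUNT(*) AS copies
    FROM ffo_demo
    GROUP BY SLS_TM_NBR, FFO_TYP_CD, FFO_NBR
    HAVING COUNT(*) > 1
""").fetchall()
print(duplicates)  # [(10, 'A', 7, 2)]
```

An empty result means the current data has no ties, though only a constraint guarantees that none will appear later.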

This was last published in July 2001
