Q

# What does this theta self-join mean?

Could you please explain to me the DB2 query given below?? It is a self join, but the part which I cannot understand...

is how it manages to generate sequential order numbers with the help of self join and where clause.

```SELECT count(*) + 1000,
A.SLS_TM_NBR,
A.FFO_TYP_CD,
A.FFO_NBR
FROM
WHERE
((A.SLS_TM_NBR > B.SLS_TM_NBR) OR
(A.SLS_TM_NBR = B.SLS_TM_NBR AND
?? A.FFO_TYP_CD > B.FFO_TYP_CD) OR
(A.SLS_TM_NBR = B.SLS_TM_NBR AND
?? A.FFO_TYP_CD = B.FFO_TYP_CD AND
?? A.FFO_NBR >= B.FFO_NBR))
GROUP BY
A.SLS_TM_NBR,
A.FFO_TYP_CD,
A.FFO_NBR
```

Don't let the complex join condition throw you. This self-join simply counts rows which are "less than" the row in question. This type of join is not an equi-join, where rows are matched that have equal values in some key. Rather, each row is joined to all the rows that satisfy the inequality. This is called a theta join even though few people use this terminology. (I just call it a "less than" join.) With each row joined to all its lesser rows, the GROUP BY generates a COUNT(*) which simply indicates how many lesser rows each row has.

One example where this type of join is useful is in determining a ranking. Imagine a table like this --

```theID  theValue
a       6
b       8
c      12
d       7
e      11
f      15
g      13
h      21
i       9```

Running a simple query with an ORDER BY will get you the records in the right sequence, but using a theta self-join will generate rankings. Remember that the ranking is actually just the number of lesser rows --

```select A.theValue
, count(*) as theRank
from theTable as A
, theTable as B
where A.theValue >= B.theValue
group by
A.theValue

theValue theRank
6       1
7       2
8       3
9       4
11       5
12       6
13       7
15       8
21       9```

Notice that the results are ranked from smallest to largest. This would be the case when ranking golf scores or demerit points, say. For other queries, where the top rank is the largest, the theta condition would be a "greater than" comparison instead of "less than." Also, notice I had to use "less than or equal to" to generate a result row for every row in the A table. Otherwise, the lowest row is left out, as there are no rows lower than the lowest.

Warning: the above query will work correctly only if there are no "ties." If you have ties in your data, the ranking query is substantially more complex.

So, given that the theta join conveniently produces a number, which may or may not be a ranking, but which in your example was further massaged by adding three zeroes to it, is this a good way to generate order numbers? The answer is maybe. It is not a good method if there are ties, because then there will be more than one row with the same number of lesser rows, and hence the same order number!! For your own comfort, double-check to make sure there are UNIQUE constraints on the table columns that are used in the WHERE clause.

This was last published in July 2001

## Content

Find more PRO+ content and other member only offers, here.

#### Have a question for an expert?

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

#### Start the conversation

Send me notifications when other members comment.

## SearchDataManagement

• ### With better scaling, semantic technology knocks on enterprise's door

Cambridge Semantics CTO Sean Martin says better scalability can lead to richer representations of data. Such advances are behind ...

• ### Big data management and analytics weather tumult -- with more in store

Cloud had a big impact on big data management and analytics last year. Machine learning and streaming designs will contribute to ...

• ### Embedded analytics to feel widest impact of machine learning projects

Ovum analyst Tony Baer discusses machine learning tools, IoT-driven streaming analytics and Hadoop in the cloud, all of which ...

• ### Efforts to monetize data should be built for the long haul

Most companies have data monetization opportunities they could exploit, experts say. But a clear strategy and long-term plan are ...

• ### Debate over big data and privacy is just getting started

For years, the tension between privacy and big data has been apparent, but with emerging technologies generating huge amounts of ...

• ### Missions for monetizing data need lift from upfront groundwork

Organizations launching data monetization strategies should factor some key initial steps into their plans to develop ...

## SearchSAP

• ### SAP HANA, express edition makes HANA easier for non-SAP developers

Relaxed RAM and licensing requirements and the ability to run on local machines could help popularize the HANA in-memory platform...

• ### SAP futurists name five future technology trends to plan for now

Three SAP futurists explain what future technology trends in augmented reality, blockchain, AI, robotics, and contingent labor ...

• ### Abakus acquisition beefs up SAP Hybris marketing portfolio

SAP acquires Abakus marketing measurement software for SAP Hybris marketing suite; openSAP adds new courses for S/4HANA, data ...

## SearchSQLServer

• ### Options for scaling out SQL Server applications to boost workloads

Scaling out a database to meet the needs of a heavy processing workload can be a challenge. Here are details on the SQL Server ...

• ### Four trends that will impact SQL Server DBAs in 2017

Flash storage adoption, cloud computing's growth, Linux's increased importance and broader big data integration are a few trends ...

• ### DATEADD and DATEDIFF SQL functions for datetime values

DATEADD and DATEDIFF SQL functions allow you to easily perform calculations, like adding a time interval from a datetime value. ...

## TheServerSide

• ### Is JSON and XML your REST performance bottleneck?

Learn how ASCII encoding formats like JSON and REST can adversely impact server application performance compared with emerging ...

• ### Making multi-cloud deployment a reality at Netflix with Spinnaker

Learn how Netflix leveraged automation and Spinnaker to perform 4,000 multi-cloud deployments per day.

• ### The benefits of Java microservices in a Docker and DevOps world

To seasoned developers, microservices may sound like SOA by another name. But Java microservices apps delivered via Docker ...

## SearchDataCenter

• ### HPE-SimpliVity deal raises support, price and development questions

With HPE's buy of No. 2 SimpliVity -- the first big deal in the HCI space -- IT pros see a more robust offering, but also higher ...

• ### Converged infrastructure drop-off doesn't mean data center death

Traditional converged infrastructure has been supplanted by hyper-converged infrastructure and cloud computing, but it remains a ...

• ### New options to evolve your data backup and recovery plan

The server backup market first evolved to protect VMs, but now it's undergoing another transformation. Find out how it's evolved ...

## SearchContentManagement

• ### Why SharePoint Framework could benefit business, not just developers

SharePoint Framework heralds a new, mobile-friendly, low-code era for Microsoft, which may give some organizations without .NET ...

• ### Using Microsoft MyAnalytics to improve workers' time management

Microsoft MyAnalytics has been hailed as the next great thing for analyzing worker productivity. But how can employees be assured...

• ### Is the Drupal content management system right for your business?

The open source Drupal content management system offers users responsive design and WYSIWYG content and web development tools, as...

## SearchFinancialApplications

• ### ADP acquires performance coaching and employee engagement software

The Marcus Buckingham Company's cloud-based performance management and employee engagement software is set to be integrated into ...

• ### Six keys to creating strong data-security measures

The rush to embrace digital technologies can put organizations at extreme risk. Here are six foundations for creating an ...

• ### Group-chat software sees explosive growth and intense competition

Microsoft Teams and Workplace by Facebook are facing off against rivals such as HipChat and Slack in a high-stakes competition ...

Close