Q

# Table differences

I have 2 tables (1/2 mil. rows each) that should be identical, but one has more records, for whatever reason. What...

is the best, most efficient way to determine which records are different? (There are no duplicates in either table.) I have heard subselects, joins, etc. from others.

"Best" and "most efficient" are not necessarily congruent. Sometimes a good solution (easy to write, easy to understand, easy to maintain) performs horribly. Sometimes the most efficient solution requires query gyrations that I would not classify as a good solution. In your case, where the tables are of reasonable size, indexes will be important no matter what you do.

You are right that you can achieve what you want several ways -- subselects, joins, and special operators.

Let's use table1 and table2 as our example tables, and let's assume we want to check for different records in both of them.

The subselect method goes like this --

```select table1.columns
from table1
where not exists
(select 1 from table2
where table2.id = table1.id)```

This gives you all the rows in table1 that don't have matching rows in table 2. Note that in the subselect after the word SELECT it is necessary to select something, so conveniently choose the integer 1 instead of a table column -- it could be anything, really (including the asterisk, but that's a different subject for another day). Since a NOT EXISTS will always evaluate only true or false, the subselect doesn't need to return anything other than an indication that a row was or was not found. (If this sounds familiar, it's my standard spiel about the EXISTS subselect, which I last used in this answer.)

We also want to check for rows in table2 that don't have matching rows in table 1, and this second query is like the previous one, but with the tables reversed --

```select table2.columns
from table2
where not exists
(select 1 from table1
where table1.id = table2.id)```

The second method involves using left joins instead of subselects --

```select table1.columns
from table1 left join table2
on table1.id = table2.id
where             table2.id is null```

This may sound a little weird, joining on a column and checking it for nulls, but that is exactly what to do to find those rows of table1 which do not have a matching row from table2. In a left join, the database places nulls into all the columns from table2 when there is no matching row from table2.

And to find all the rows from table2 that are different, that don't have a match in table1, we could use "table1 right join table2" instead of a left join, but that just confuses things unduly and I prefer to write left joins in all cases --

```select table2.columns
from table2 left join table1
on table2.id = table1.id
where             table1.id is null```

The third method is the "best" solution in my opinion, because it uses SQL language operators intended for just this situation. However, not all databases implement these operators.

To find all the rows of table1 that do not exist in table2, use this query --

```select table1.columns
from table1
except
select table2.columns
from table2```

The EXCEPT operator is called MINUS in Oracle.

We also want the rows of table2 that aren't in table1, and the query for that is, yup, you guessed it --

```select table2.columns
from table2
except
select table1.columns
from table1```

As for efficiency, the database will determine its own access strategy -- for instance, subselects are usually implemented as though they were joins anyway. I haven't seen how the EXCEPT operator is implemented, but it's fair to assume that it will be just as efficient as the other methods. Don't forget your indexes on the primary keys!

This was last published in May 2001

## Content

Find more PRO+ content and other member only offers, here.

#### Have a question for an expert?

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

#### Start the conversation

Send me notifications when other members comment.

## SearchDataManagement

• ### With better scaling, semantic technology knocks on enterprise's door

Cambridge Semantics CTO Sean Martin says better scalability can lead to richer representations of data. Such advances are behind ...

• ### Big data management and analytics weather tumult -- with more in store

Cloud had a big impact on big data management and analytics last year. Machine learning and streaming designs will contribute to ...

• ### Embedded analytics to feel widest impact of machine learning projects

Ovum analyst Tony Baer discusses machine learning tools, IoT-driven streaming analytics and Hadoop in the cloud, all of which ...

• ### Efforts to monetize data should be built for the long haul

Most companies have data monetization opportunities they could exploit, experts say. But a clear strategy and long-term plan are ...

• ### Debate over big data and privacy is just getting started

For years, the tension between privacy and big data has been apparent, but with emerging technologies generating huge amounts of ...

• ### Missions for monetizing data need lift from upfront groundwork

Organizations launching data monetization strategies should factor some key initial steps into their plans to develop ...

## SearchSAP

• ### SAP HANA, express edition makes HANA easier for non-SAP developers

Relaxed RAM and licensing requirements and the ability to run on local machines could help popularize the HANA in-memory platform...

• ### SAP futurists name five future technology trends to plan for now

Three SAP futurists explain what future technology trends in augmented reality, blockchain, AI, robotics, and contingent labor ...

• ### Abakus acquisition beefs up SAP Hybris marketing portfolio

SAP acquires Abakus marketing measurement software for SAP Hybris marketing suite; openSAP adds new courses for S/4HANA, data ...

## SearchSQLServer

• ### Options for scaling out SQL Server applications to boost workloads

Scaling out a database to meet the needs of a heavy processing workload can be a challenge. Here are details on the SQL Server ...

• ### Four trends that will impact SQL Server DBAs in 2017

Flash storage adoption, cloud computing's growth, Linux's increased importance and broader big data integration are a few trends ...

• ### DATEADD and DATEDIFF SQL functions for datetime values

DATEADD and DATEDIFF SQL functions allow you to easily perform calculations, like adding a time interval from a datetime value. ...

## TheServerSide

• ### Is JSON and XML your REST performance bottleneck?

Learn how ASCII encoding formats like JSON and REST can adversely impact server application performance compared with emerging ...

• ### Making multi-cloud deployment a reality at Netflix with Spinnaker

Learn how Netflix leveraged automation and Spinnaker to perform 4,000 multi-cloud deployments per day.

• ### The benefits of Java microservices in a Docker and DevOps world

To seasoned developers, microservices may sound like SOA by another name. But Java microservices apps delivered via Docker ...

## SearchDataCenter

• ### HPE-SimpliVity deal raises support, price and development questions

With HPE's buy of No. 2 SimpliVity -- the first big deal in the HCI space -- IT pros see a more robust offering, but also higher ...

• ### Converged infrastructure drop-off doesn't mean data center death

Traditional converged infrastructure has been supplanted by hyper-converged infrastructure and cloud computing, but it remains a ...

• ### New options to evolve your data backup and recovery plan

The server backup market first evolved to protect VMs, but now it's undergoing another transformation. Find out how it's evolved ...

## SearchContentManagement

• ### Why SharePoint Framework could benefit business, not just developers

SharePoint Framework heralds a new, mobile-friendly, low-code era for Microsoft, which may give some organizations without .NET ...

• ### Using Microsoft MyAnalytics to improve workers' time management

Microsoft MyAnalytics has been hailed as the next great thing for analyzing worker productivity. But how can employees be assured...

• ### Is the Drupal content management system right for your business?

The open source Drupal content management system offers users responsive design and WYSIWYG content and web development tools, as...

## SearchFinancialApplications

• ### ADP acquires performance coaching and employee engagement software

The Marcus Buckingham Company's cloud-based performance management and employee engagement software is set to be integrated into ...

• ### Six keys to creating strong data-security measures

The rush to embrace digital technologies can put organizations at extreme risk. Here are six foundations for creating an ...

• ### Group-chat software sees explosive growth and intense competition

Microsoft Teams and Workplace by Facebook are facing off against rivals such as HipChat and Slack in a high-stakes competition ...

Close