# Table differences

I have 2 tables (1/2 mil. rows each) that should be identical, but one has more records, for whatever reason. What is the best, most efficient way to determine which records are different? (There are no duplicates in either table.) Others have suggested subselects, joins, etc.

"Best" and "most efficient" are not necessarily congruent. Sometimes a good solution (easy to write, easy to understand, easy to maintain) performs horribly. Sometimes the most efficient solution requires query gyrations that I would not classify as a good solution. In your case, where the tables are of reasonable size, indexes will be important no matter what you do.

You are right that you can achieve what you want several ways -- subselects, joins, and special operators.

Let's use table1 and table2 as our example tables, let's assume that id is the primary key of each, and let's assume we want to find the rows in each table that have no matching row in the other.

The subselect method goes like this --

```sql
select table1.columns
from table1
where not exists
      (select 1 from table2
       where table2.id = table1.id)
```

This gives you all the rows in table1 that don't have matching rows in table2. Note that the subselect has to select something after the word SELECT, so I conveniently choose the integer 1 instead of a table column -- it could be anything, really (including the asterisk, but that's a different subject for another day). Since NOT EXISTS only ever evaluates to true or false, the subselect doesn't need to return anything other than an indication that a row was or was not found. (If this sounds familiar, it's my standard spiel about the EXISTS subselect.)

We also want to check for rows in table2 that don't have matching rows in table1, and this second query is like the previous one, but with the tables reversed --

```sql
select table2.columns
from table2
where not exists
      (select 1 from table1
       where table1.id = table2.id)
```
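To try the NOT EXISTS approach on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. The `name` column, the sample rows and the `missing_from` helper are assumptions made up for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3

# Hypothetical sample data: ids 1 and 2 are in both tables,
# id 3 exists only in table1 and id 4 only in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id integer primary key, name text);
    create table table2 (id integer primary key, name text);
    insert into table1 values (1, 'a'), (2, 'b'), (3, 'c');
    insert into table2 values (1, 'a'), (2, 'b'), (4, 'd');
""")

def missing_from(conn, left, right):
    """Return the ids of rows in `left` that have no matching id in `right`."""
    # Table names cannot be bound as query parameters, so they are
    # formatted into the statement; pass only trusted, hard-coded names.
    sql = f"""
        select {left}.id
        from {left}
        where not exists
              (select 1 from {right}
               where {right}.id = {left}.id)
        order by {left}.id
    """
    return [row[0] for row in conn.execute(sql)]

only_in_table1 = missing_from(conn, "table1", "table2")
only_in_table2 = missing_from(conn, "table2", "table1")
print(only_in_table1, only_in_table2)  # prints: [3] [4]
```

Running the same helper in both directions mirrors the two queries above: one call per table, with the roles swapped.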

The second method involves using left joins instead of subselects --

```sql
select table1.columns
from table1 left join table2
  on table1.id = table2.id
where table2.id is null
```

This may sound a little weird, joining on a column and checking it for nulls, but that is exactly what to do to find those rows of table1 which do not have a matching row from table2. In a left join, the database places nulls into all the columns from table2 when there is no matching row from table2.

And to find all the rows from table2 that are different, that don't have a match in table1, we could use "table1 right join table2" instead of a left join, but that just confuses things unduly and I prefer to write left joins in all cases --

```sql
select table2.columns
from table2 left join table1
  on table2.id = table1.id
where table1.id is null
```
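The null-checking trick may be easier to see with the unfiltered join result next to the filtered one. The following is a small sketch using Python's sqlite3 module; the `name` column and the sample rows are assumptions invented for the demonstration.

```python
import sqlite3

# Hypothetical sample data: id 3 exists only in table1, id 4 only in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id integer primary key, name text);
    create table table2 (id integer primary key, name text);
    insert into table1 values (1, 'a'), (2, 'b'), (3, 'c');
    insert into table2 values (1, 'a'), (2, 'b'), (4, 'd');
""")

# Without the WHERE clause, every table1 row comes back; the table2
# column is NULL (None in Python) where no matching row was found.
joined = conn.execute("""
    select table1.id, table1.name, table2.id
    from table1 left join table2
      on table1.id = table2.id
    order by table1.id
""").fetchall()
print(joined)  # prints: [(1, 'a', 1), (2, 'b', 2), (3, 'c', None)]

# Adding "where table2.id is null" keeps only the unmatched rows.
only_in_table1 = conn.execute("""
    select table1.id, table1.name
    from table1 left join table2
      on table1.id = table2.id
    where table2.id is null
""").fetchall()
print(only_in_table1)  # prints: [(3, 'c')]
```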

The third method is the "best" solution in my opinion, because it uses SQL language operators intended for just this situation. However, not all databases implement these operators.

To find all the rows of table1 that do not exist in table2, use this query --

```sql
select table1.columns
from table1
except
select table2.columns
from table2
```

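The EXCEPT operator is called MINUS in Oracle. Note that EXCEPT compares every selected column, so unlike the two id-based methods above, it also returns rows whose id exists in both tables but whose other column values differ.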

We also want the rows of table2 that aren't in table1, and the query for that is, yup, you guessed it --

```sql
select table2.columns
from table2
except
select table1.columns
from table1
```
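One thing worth checking when choosing between the methods is what counts as "different". EXCEPT compares whole selected rows, while the NOT EXISTS and left join queries shown earlier compare only id. The sketch below, again using Python's sqlite3 module with made-up sample data and a made-up `name` column, includes a row (id 2) that is present in both tables but with a different value.

```python
import sqlite3

# Hypothetical sample data: id 2 is in both tables with different names,
# id 3 exists only in table1 and id 4 only in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id integer primary key, name text);
    create table table2 (id integer primary key, name text);
    insert into table1 values (1, 'a'), (2, 'b'), (3, 'c');
    insert into table2 values (1, 'a'), (2, 'B'), (4, 'd');
""")

# EXCEPT compares every selected column, so the changed row is reported.
except_rows = conn.execute("""
    select id, name from table1
    except
    select id, name from table2
    order by id
""").fetchall()
print(except_rows)  # prints: [(2, 'b'), (3, 'c')]

# The id-only NOT EXISTS check reports just the row that is missing.
not_exists_rows = conn.execute("""
    select table1.id, table1.name
    from table1
    where not exists
          (select 1 from table2
           where table2.id = table1.id)
    order by table1.id
""").fetchall()
print(not_exists_rows)  # prints: [(3, 'c')]
```

If only missing ids matter, selecting just the id column on both sides of the EXCEPT gives the same answer as the other two methods.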

As for efficiency, the database will determine its own access strategy -- for instance, subselects are usually implemented as though they were joins anyway. I haven't seen how the EXCEPT operator is implemented, but it's fair to assume that it will be just as efficient as the other methods. Don't forget your indexes on the primary keys!
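Rather than guessing at the access strategy, you can usually ask the database to show it. The sketch below uses SQLite's `explain query plan` statement through Python's sqlite3 module. That statement is specific to SQLite (other databases have their own EXPLAIN facilities with different syntax and output), and the table definitions here are assumptions for illustration. The plan text varies between SQLite versions, so no particular output is shown.

```python
import sqlite3

# Hypothetical schema; "integer primary key" gives each table an index on id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id integer primary key, name text);
    create table table2 (id integer primary key, name text);
""")

queries = {
    "not exists": """
        select table1.id from table1
        where not exists
              (select 1 from table2 where table2.id = table1.id)""",
    "left join": """
        select table1.id
        from table1 left join table2 on table1.id = table2.id
        where table2.id is null""",
    "except": """
        select id from table1
        except
        select id from table2""",
}

plans = {}
for label, sql in queries.items():
    # The last column of each returned row is a human-readable
    # description of one step in the plan SQLite chose.
    steps = [row[-1] for row in conn.execute("explain query plan " + sql)]
    plans[label] = steps
    print(label)
    for step in steps:
        print("   ", step)
```

Comparing the plans for the three queries on your own tables, with your own indexes, is a more reliable guide than any general rule.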
