What's the syntax for comparing two different tables using SQL?
If rows exist in both tables, but the contents on columns are different, I'd like to update with newest data.
If rows are the same, I would like to delete contents from another table.
If rows don't exist in the second table, I would like to add rows to the second table.
The first thing we need to establish is that these questions are meaningful only if the tables have primary keys. For example, updating all rows in the first table that have different contents from any row in the second table will obviously really mess things up. So let's assume each table has a primary key:
table1: pk1, a1, b1, ...
table2: pk2, a2, b2, ...
Rows will be matched on pk1=pk2, and all other columns are data columns. Let's further assume that table2 is the "driver" and all changes will affect table1.
1. Update with newest data
Some databases (e.g. Microsoft SQL Server) allow you to update one table from another:
update table1 set a1 = a2 , b1 = b2 , ... from table2 where pk1 = pk2 and ( a1 <> a2 or b1 <> b2 or ... )
UPDATE FROM is not, as far as I know, standard SQL, but it's extremely useful. The FROM clause works the same as it does in a join.
Note the use of inequality conditions in the WHERE clause to select, and thus update, only those rows where the columns are different. Depending on how many rows you expect will match, an alternative strategy is to omit these conditions and simply permit the update to proceed for all matching rows. This works well if table1 has many rows and table2 only a few, because even if some matching rows are updated that didn't need to be, you get a performance gain by not testing all the columns.
What if you cannot use UPDATE FROM? In that case, you need to delete and insert, a procedure involving two or three steps, depending on whether there are any columns defined in table1 that don't exist in table2. If there are, you must prepare "updated" rows through a join and save them somewhere, like a working table. Let's say table1 includes column x1, which doesn't exist in table2:
delete from worktable1
insert into worktable1 select pk1, a2, b2, ... x1 from table1 , table2 where pk1 = pk2
delete from table1 where pk1 in ( select pk1 from worktable1 )
insert into table1 select * from worktable1
Whew. Note that since the working table is designed to have exactly the same layout as table1, this is one of the rare cases where "select star" is acceptable, since SELECT statements should ordinarily always list the columns being selected.
If table1 and table2 have identical layouts, then you don't need the working table:
delete from table1 where pk1 in ( select pk2 from table2 )
insert into table1 select * from table2
2. Delete matching rows
This one's easy. It's the first part of the two-step update above. Depending on the context in which matching rows should be deleted, you might run this separately or as part of an update procedure.
3. Insert rows that don't exist
This, too, is part of the two-step update above, although it might not be obvious. If there are any table2 rows that don't have a match in table1, they will get inserted in the second step along with the rows that that were deleted in the first step because they did match. The second step thus assumes that all rows being inserted don't exist. If you are inserting rows that don't exist as a separate procedure from the two-step update, then you may need:
insert into table1 select * from table2 where pk2 not in ( select pk1 from table1 )
Some databases will not let you select from a table that you are inserting into, so one way around this, if your situation allows it, is to delete from table2 any rows that already exist in table1, and then simply insert the rest.