I have two tables, with the following fields:
table1
tyear  country  value1
1998   US       32
1998   US       21
2000   CN       90
2001   US       12

table2
tyear  country  value2
1998   US       11
1998   US       80
2000   CN       90
2001   US       15
I simply want to sum value1 and value2, and then get the difference between the two sums, in one SQL expression. For example, I first choose the year and then the country, which gives me several rows of value1 and value2. I want to sum value1, sum value2, and put the difference between the two sums in another field such as "diff". So if I choose tyear=1998 and country=US, what I want is this:
query result
tyear  country  value1  value2  diff
1998   US       53      91      -38
The solution will involve a derived table for each of your two tables. A derived table is just a subquery, and since a subquery, like any query, produces a result set, we simply treat the subquery result set just as we would an ordinary table. This property of SQL is called orthogonality:
From the Greek for "having straight angles". A language is orthogonal if analogous situations have similar rules. For example, in SQL-89 a subquery was legal in a WHERE clause but not in a SELECT list; now it is legal in both places, so in this respect we can say that "SQL-92 is more orthogonal than SQL-89".
-- SQL Glossary
In this case we want to use two derived tables in the FROM clause, like this:
select dt1.tyear
     , dt1.country
     , dt1.sumvalue1 as value1
     , dt2.sumvalue2 as value2
     , dt1.sumvalue1 - dt2.sumvalue2 as diff
  from (
       select tyear
            , country
            , sum(value1) as sumvalue1
         from table1
        group by tyear
            , country
       ) as dt1
 inner join (
       select tyear
            , country
            , sum(value2) as sumvalue2
         from table2
        group by tyear
            , country
       ) as dt2
    on dt1.tyear = dt2.tyear
   and dt1.country = dt2.country
 where dt1.tyear = 1998
   and dt1.country = 'US'
This is the form of the query which allows for easiest maintenance, since you have to specify the tyear and country in only one spot.
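To check this query against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's sqlite3 module. SQLite, the column types, and the Python variable names are assumptions made for illustration; the question does not say which database is in use.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (tyear integer, country text, value1 integer);
create table table2 (tyear integer, country text, value2 integer);
insert into table1 values
  (1998, 'US', 32), (1998, 'US', 21), (2000, 'CN', 90), (2001, 'US', 12);
insert into table2 values
  (1998, 'US', 11), (1998, 'US', 80), (2000, 'CN', 90), (2001, 'US', 15);
""")

# The query with two derived tables, each grouped by tyear and country.
grouped_query = """
select dt1.tyear
     , dt1.country
     , dt1.sumvalue1 as value1
     , dt2.sumvalue2 as value2
     , dt1.sumvalue1 - dt2.sumvalue2 as diff
  from ( select tyear, country, sum(value1) as sumvalue1
           from table1
          group by tyear, country ) as dt1
 inner join
       ( select tyear, country, sum(value2) as sumvalue2
           from table2
          group by tyear, country ) as dt2
    on dt1.tyear = dt2.tyear
   and dt1.country = dt2.country
 where dt1.tyear = 1998
   and dt1.country = 'US'
"""

rows = conn.execute(grouped_query).fetchall()
print(rows)  # [(1998, 'US', 53, 91, -38)]
```

The single row matches the desired result: 32 + 21 = 53, 11 + 80 = 91, and 53 - 91 = -38.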
A similar query that also produces what you want is:
select 1998 as tyear
     , 'US' as country
     , dt1.sumvalue1 as value1
     , dt2.sumvalue2 as value2
     , dt1.sumvalue1 - dt2.sumvalue2 as diff
  from (
       select sum(value1) as sumvalue1
         from table1
        where tyear = 1998
          and country = 'US'
       ) as dt1
 cross join (
       select sum(value2) as sumvalue2
         from table2
        where tyear = 1998
          and country = 'US'
       ) as dt2
Can you see the difference? In the second query, the subqueries do not use a GROUP BY. Nevertheless, can you see that they will produce the same results? Which query do you think will be more efficient?
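One way to explore these questions is to run both forms side by side. The sketch below again assumes SQLite via Python's sqlite3 module, with the sample rows from the question; the named parameters :tyear and :country and the helper function run are illustrative additions, not part of the original queries. It says nothing about efficiency, which depends on the database and its optimizer, but it does show one behavioural difference: when no rows match, the grouped form returns no row at all, while the cross join form still returns one row whose sums are NULL.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (tyear integer, country text, value1 integer);
create table table2 (tyear integer, country text, value2 integer);
insert into table1 values
  (1998, 'US', 32), (1998, 'US', 21), (2000, 'CN', 90), (2001, 'US', 12);
insert into table2 values
  (1998, 'US', 11), (1998, 'US', 80), (2000, 'CN', 90), (2001, 'US', 15);
""")

# First form: grouped derived tables, filtered once in the outer WHERE.
grouped_query = """
select dt1.tyear, dt1.country
     , dt1.sumvalue1 as value1
     , dt2.sumvalue2 as value2
     , dt1.sumvalue1 - dt2.sumvalue2 as diff
  from ( select tyear, country, sum(value1) as sumvalue1
           from table1 group by tyear, country ) as dt1
 inner join
       ( select tyear, country, sum(value2) as sumvalue2
           from table2 group by tyear, country ) as dt2
    on dt1.tyear = dt2.tyear and dt1.country = dt2.country
 where dt1.tyear = :tyear and dt1.country = :country
"""

# Second form: no GROUP BY, filter repeated inside each derived table.
cross_join_query = """
select :tyear as tyear, :country as country
     , dt1.sumvalue1 as value1
     , dt2.sumvalue2 as value2
     , dt1.sumvalue1 - dt2.sumvalue2 as diff
  from ( select sum(value1) as sumvalue1
           from table1 where tyear = :tyear and country = :country ) as dt1
 cross join
       ( select sum(value2) as sumvalue2
           from table2 where tyear = :tyear and country = :country ) as dt2
"""

def run(query, tyear, country):
    """Hypothetical helper: run one of the two queries for a tyear/country pair."""
    return conn.execute(query, {"tyear": tyear, "country": country}).fetchall()

# Matching rows exist: both forms agree.
print(run(grouped_query, 1998, "US"))     # [(1998, 'US', 53, 91, -38)]
print(run(cross_join_query, 1998, "US"))  # [(1998, 'US', 53, 91, -38)]

# No rows for 1999: the two forms differ.
print(run(grouped_query, 1999, "US"))     # []
print(run(cross_join_query, 1999, "US"))  # [(1999, 'US', None, None, None)]
```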
Which query would you use if you needed the difference for two separate tyears?
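As one possible reading of that last question, the grouped form extends to several tyears by changing only the outer WHERE clause, because each derived table already produces one row per tyear and country. The sketch below assumes SQLite via Python's sqlite3 module and the sample rows from the question; the IN list and the ORDER BY are illustrative additions.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (tyear integer, country text, value1 integer);
create table table2 (tyear integer, country text, value2 integer);
insert into table1 values
  (1998, 'US', 32), (1998, 'US', 21), (2000, 'CN', 90), (2001, 'US', 12);
insert into table2 values
  (1998, 'US', 11), (1998, 'US', 80), (2000, 'CN', 90), (2001, 'US', 15);
""")

# Same grouped derived tables as before; only the outer filter changes.
two_years_query = """
select dt1.tyear, dt1.country
     , dt1.sumvalue1 as value1
     , dt2.sumvalue2 as value2
     , dt1.sumvalue1 - dt2.sumvalue2 as diff
  from ( select tyear, country, sum(value1) as sumvalue1
           from table1 group by tyear, country ) as dt1
 inner join
       ( select tyear, country, sum(value2) as sumvalue2
           from table2 group by tyear, country ) as dt2
    on dt1.tyear = dt2.tyear and dt1.country = dt2.country
 where dt1.tyear in (1998, 2001)
   and dt1.country = 'US'
 order by dt1.tyear
"""

rows = conn.execute(two_years_query).fetchall()
for row in rows:
    print(row)
# (1998, 'US', 53, 91, -38)
# (2001, 'US', 12, 15, -3)
```

The form without GROUP BY collapses everything matched by its WHERE clauses into a single pair of sums, so it would need a GROUP BY added back, or one copy of the query per tyear, to report the years separately.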