I am carrying out an inner join across three separate tables. I have a unique primary key in the first table, which is then added multiple times as a general numeric field in the second table, as Customers details are updated. The third table is just general additional customer information.
The problem I have to overcome is - how do I ally the unique entry in the first table with only one (the latest) matching id in the second table, to show the latest data (remembering there may be several data rows containing the same id). The data has to be laid out in three separate tables, and I have tried various "distinct" or "top 1" or "order desc" type queries, none of which appear to return the one record I require.
In order to determine which row is the latest, you need a column which will have values that always increase. Two suitable candidates are:
auto_increment, autonumber, sequence, or identity -- depending on your database system
timestamp or datetime -- depending on your database system
The idea here is that when a new row is added, this column's value will always be greater than any value currently in the table. Auto_increment (or similar) columns are sequential, so they work great, even if subsequent deletions leave gaps (see Gaps in autonumber sequences). Timestamp or datetime values work well too, assuming that there is no possibility of two rows being added at the exact same millisecond (or whatever the lowest fraction of the timestamp/datetime column is).
Let's say the three tables looked like this:
create table customers ( cust_id integer not null primary key , cust_name varchar(100) ) create table cust_contacts ( cust_id integer not null , foreign key (cust_id) references customers (cust_id) , contact_date datetime , primary key (cust_id, contact_date) , contact_by varchar(9) ) create table cust_addresses ( cust_id integer not null , foreign key (cust_id) references customers (cust_id) , address_line1 varchar(100) , address_line2 varchar(100) , city varchar(50) )
Then your query would be:
select t1.cust_id , t1.cust_name , t2.contact_date , t2.contact_by , t3.address_line1 , t3.address_line2 , t3.city from customers as t1 inner join cust_contacts as t2 on t1.cust_id = t2.cust_id and t2.contact_date = ( select max(contact_date) from cust_contacts where cust_id = t1.cust_id ) inner join cust_addresses as t3 on t1.cust_id = t3.cust_id
Note that in the subquery, another "copy" of the cust_contacts table is used. This is a correlated subquery, which finds the latest contact_date for the same cust_id.
This was first published in October 2004