Only the latest matching row

I am carrying out an inner join across three separate tables. I have a unique primary key in the first table, which...

is then added multiple times as a general numeric field in the second table, as Customers details are updated. The third table is just general additional customer information.

The problem I have to overcome is - how do I ally the unique entry in the first table with only one (the latest) matching id in the second table, to show the latest data (remembering there may be several data rows containing the same id). The data has to be laid out in three separate tables, and I have tried various "distinct" or "top 1" or "order desc" type queries, none of which appear to return the one record I require.

In order to determine which row is the latest, you need a column which will have values that always increase. Two suitable candidates are:

  1. auto_increment, autonumber, sequence, or identity -- depending on your database system

  2. timestamp or datetime -- depending on your database system

The idea here is that when a new row is added, this column's value will always be greater than any value currently in the table. Auto_increment (or similar) columns are sequential, so they work great, even if subsequent deletions leave gaps (see Gaps in autonumber sequences). Timestamp or datetime values work well too, assuming that there is no possibility of two rows being added at the exact same millisecond (or whatever the lowest fraction of the timestamp/datetime column is).

Let's say the three tables looked like this:

create table customers
( cust_id integer not null primary key
, cust_name varchar(100)
create table cust_contacts
( cust_id integer not null 
, foreign key (cust_id) references customers (cust_id)
, contact_date datetime
, primary key (cust_id, contact_date)
, contact_by varchar(9)
create table cust_addresses
( cust_id integer not null 
, foreign key (cust_id) references customers (cust_id)
, address_line1 varchar(100)
, address_line2 varchar(100)
, city varchar(50)

Then your query would be:

select t1.cust_id
     , t1.cust_name 
     , t2.contact_date 
     , t2.contact_by 
     , t3.address_line1
     , t3.address_line2 
     , t3.city
  from customers       as t1
  join cust_contacts   as t2
    on t1.cust_id 
     = t2.cust_id
   and t2.contact_date =
     ( select max(contact_date)
         from cust_contacts
        where cust_id = t1.cust_id )     
  join cust_addresses  as t3 
    on t1.cust_id 
     = t3.cust_id 

Note that in the subquery, another "copy" of the cust_contacts table is used. This is a correlated subquery, which finds the latest contact_date for the same cust_id.

This was first published in October 2004
This Content Component encountered an error

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.



Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: