I have a table with two fields, RequestID and RequestDate. I want a query to return the latest distinct RequestIDs. In the sense, if there are four records of RequestID 1 and three records of RequestID 2, I would like to get back only two records, which have the latest RequestID 1 and the latest RequestID 2.
The standard SQL approach to this problem is really quite straightforward:
select RequestID , RequestDate from yourtable T where RequestDate = ( select max(RequestDate) from yourtable where RequestID = T.RequestID )
This query uses a correlated subquery. Note the use of the T alias to correlate the rows of the subquery to the row of the outer query. In effect, correlation like this is another way to do grouping.
However, if you are using MySQL, versions prior to 4.1, the above will not work. In that case, try this:
select T1.RequestID , T1.RequestDate from yourtable T1 inner join yourtable T2 on T1.RequestID = T2.RequestID group by T1.RequestID , T1.RequestDate having T1.RequestDate = max(T2.RequestDate)
This query uses a self join, joining the table to itself on matching keys, and an explicit GROUP BY clause to perform the same grouping as the first query, with a HAVING clause to filter the rows of the group. Not surprisingly, it produces the same result set. (As the saying goes, "Your mileage may vary; mine doesn't.")
This query works in other databases, too, not just MySQL. I cannot comment on which query is more efficient, though, since each database's optimizer is different; however, I expect a good optimizer to produce efficient, if not identical, execution plans for both queries.
See also Latest two rows for each group.