Hope you can find a solution to this. I've searched the site and tried 'self-joins' but can't get this to work. The current solution we're running requires 15 queries to get the necessary result! I have three MySQL tables:
- One stores articles with publish dates and other details.
- One stores industries.
- One serves as a join for the two, where an article is linked to one or more industries using the article IDs and industry IDs.
I want to display the latest two articles from each industry. Can this be done in one or maybe two queries?
It can be done in one. This is an example of the "Latest X for each Y" common SQL question. See SQL FAQ: Common SQL Questions, part 3 (5 July, 2007). However, this one's a bit trickier since it involves a many-to-many join.
select I.industry , A.title , A.pubdate from industries as I inner join industryarticles as IA on IA.industry_ID = I.ID inner join articles as A on A.ID = IA.article_ID and 2 > ( select count(*) from industryarticles as IA2 inner join articles as A2 on A2.ID = IA2.article_ID where IA2.industry_ID = I.ID and A2.pubdate > A.pubdate ) order by I.industry , A.pubdate desc
The subquery—which is correlated to the outer query using the I correlation variable to link industries and the A correlation variable to link articles—counts the number of articles in the same industry which have a later date than the article in the outer query. This count must be less than the number 2. In other words, it must be 1 or 0. If the article being considered in the outer query has only 1 or 0 articles in the same industry with a later date, then it has to be among the latest two for that industry.
This was first published in July 2007