Getting a count of zero for unmatched rows
The question is very clear. Don't feel bad, because everybody stumbles over this problem the first time.
The situation can be illustrated with the following sample data:
References ID Reference 1 HTML 2 CSS 3 PHP 4 SQL Data ID RefID 24 1 24 2 24 4 27 2 27 4 31 1 31 4
In order to understand how to get the correct counts, let's first write a simple LEFT OUTER JOIN and show the results:
select References.ID , References.Reference , Data.ID as Data_ID from References left outer join Data on References.ID = Data.RefID order by References.ID , Data.ID ID Reference Data_ID 1 HTML 24 1 HTML 31 2 CSS 24 2 CSS 27 3 PHP null 4 SQL 24 4 SQL 27 4 SQL 31
Notice the row in the result set for PHP. This row shows up in a LEFT OUTER JOIN, but would be eliminated in an INNER JOIN. (Make sure you understand why.)
Now let's add the GROUP BY and the COUNT() function.
select References.ID , References.Reference , count(*) as Data_rows from References left outer join Data on References.ID = Data.RefID group by References.ID , Data.ID order by References.ID ID Reference Data_rows 1 HTML 2 2 CSS 2 3 PHP 1 4 SQL 3
Wait a minute, that's wrong! We know there are no rows for PHP, but the count is showing 1. Why is this?
The answer is: COUNT(*) counts rows. There is a row for PHP in the result set.
What you have to do is count the number of Data.RefID values. Note that in the result set, the value of this column for the PHP row is NULL. And as you know, aggregate functions like COUNT() and SUM() ignore nulls.
select References.ID , References.Reference , count(Data.RefID) as Data_rows from References left outer join Data on References.ID = Data.RefID group by References.ID , Data.ID order by References.ID ID Reference Data_rows 1 HTML 2 2 CSS 2 3 PHP 0 4 SQL 3
See the difference?