Ask the Expert

Getting a count of zero for unmatched rows

I have a reference table and a data table and wish to perform a GROUP BY. The thing I am trying to accomplish is the pull EVERY 'reference' from the reference table and count the rows from the data table, respectively. Of course I can simply perform the GROUP BY on the data table and get the stats. But this only gives me existing data counts. I was hoping to build a single query that gives counts, including those references with counts of zero. I believe this should be possible. Hopefully the question is clear.

    Requires Free Membership to View

The question is very clear. Don't feel bad, because everybody stumbles over this problem the first time.

The situation can be illustrated with the following sample data:

References
ID Reference
1  HTML  
2  CSS 
3  PHP
4  SQL
    
Data
ID RefID 
24  1
24  2
24  4
27  2
27  4
31  1
31  4

In order to understand how to get the correct counts, let's first write a simple LEFT OUTER JOIN and show the results:

select References.ID
     , References.Reference 
     , Data.ID     as Data_ID
  from References
left outer
  join Data
    on References.ID = Data.RefID
order
    by References.ID
     , Data.ID      
    
ID Reference Data_ID      
1  HTML      24 
1  HTML      31 
2  CSS       24
2  CSS       27
3  PHP       null
4  SQL       24      
4  SQL       27      
4  SQL       31 

Notice the row in the result set for PHP. This row shows up in a LEFT OUTER JOIN, but would be eliminated in an INNER JOIN. (Make sure you understand why.)

Now let's add the GROUP BY and the COUNT() function.

select References.ID
     , References.Reference 
     , count(*)     as Data_rows
  from References
left outer
  join Data
    on References.ID = Data.RefID
group
    by References.ID
     , Data.ID      
order
    by References.ID
    
ID Reference Data_rows      
1  HTML      2
2  CSS       2
3  PHP       1
4  SQL       3 

Wait a minute, that's wrong! We know there are no rows for PHP, but the count is showing 1. Why is this?

The answer is: COUNT(*) counts rows. There is a row for PHP in the result set.

What you have to do is count the number of Data.RefID values. Note that in the result set, the value of this column for the PHP row is NULL. And as you know, aggregate functions like COUNT() and SUM() ignore nulls.

select References.ID
     , References.Reference 
     , count(Data.RefID)     as Data_rows
  from References
left outer
  join Data
    on References.ID = Data.RefID
group
    by References.ID
     , Data.ID      
order
    by References.ID
     
ID Reference Data_rows      
1  HTML      2
2  CSS       2
3  PHP       0
4  SQL       3 

See the difference?

This was first published in May 2005

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: