I would like your explanation of the difference between a LEFT OUTER JOIN ON condition and a WHERE condition. Thank you.
The difference is subtle, but it is a big difference. The ON condition stipulates which rows will be returned in the join, while the WHERE condition acts as a filter on the rows that actually were returned.
Simple example: Consider a student table, consisting of one row per student, with student id and student name. In a second table, a list of grades that students have received, with student_id, subject, and grade. Give me a list of all students, and show their grade in Math. This requires a LEFT OUTER JOIN, because you want all students, and you know that some of them didn't take Math. Here are two queries:
select name , grades.grade as math_grade from students left outer join grades on students.id = grades.student_id where grades.subject = 'Math'
select name , grades.grade as math_grade from students left outer join grades on students.id = grades.student_id and grades.subject = 'Math'
Now for the crucial difference: the first query returns only those students who took Math, and those who didn't are not included. In the second query, all students are included, and those who took Math have their grade shown.
Why the difference? In the first query, the LEFT OUTER JOIN returns all students, even if they didn't take Math. If they didn't take Math, then the joined row that is returned by the LEFT OUTER JOIN will have NULLs in all the columns from the grades table. But then for each such joined row returned, the WHERE clause comes along and picks only those rows which are Math. And since NULL isn't equal to anything, students who didn't take Math disappear from the results.
In the second query, the join condition means that students who took Math are returned, or else NULL because it's a LEFT OUTER JOIN. So all students are included in the results, because there's no WHERE clause to filter them out. Their Math grade will be their Math grade or else NULL.
In effect, the first query behaves the same as an inner join. Only the matched rows are retained after the WHERE clause has done its job. Why bother returning rows to the WHERE clause that you want filtered out? Make it an INNER JOIN and save some needless processing. Of course, if you do want a LEFT OUTER JOIN, make sure that any filter conditions on the right table are in the ON clause, not the WHERE clause.
This was first published in September 2005