Problem solve Get help with specific problems with your technologies, process and projects.

A "one-to-every" relationship

I am having a difficult time with a "one-to-every" relationship. I am trying to find the employees who worked on EVERY database project. Employees exist on the EmpProj table with EmpNo (Employee number) and ProjNo (Project number). The Project table has ProjNo and project type. I need to find all projects that exist on the Project table that have a ProjType of 'database' and then find only the employees that have worked on EVERY one of them.

The first thing we need to do is find those employees who have worked on any database project:

select EmpNo, count(*) as DatabaseProjects
  from EmpProj inner join Project
    on EmpProj.EmpNo = Project.EmpNo
   and                 Project.ProjType='database'
group by EmpNo

The count that is returned in the DatabaseProjects column will be one of two values:

  1. the number of database projects, assuming the employee can only work on a single project once, i.e. the combination of values in EmpNo and ProjNo in the EmpProj table is unique

  2. the number of assignments to database projects, if the employee can work on a single project more than once

In the first case, COUNT(*) will suffice, but in the second case, to satisfy the original question, we need to use COUNT(DISTINCT ProjNo).

Tip: COUNT(*) is always faster than COUNT(anythingelse), but you may not always be able to use it. Analyze your table structure to determine what's appropriate.

Let's assume the second case, that an employee can work on the same project more than once. If that doesn't sound reasonable, just change "employee" to "consultant" and "project" to "client" and suddenly the need to count distinct clients is more obvious.

select EmpNo
  from EmpProj inner join Project
    on EmpProj.EmpNo = Project.EmpNo
   and                 Project.ProjType='database'
group by EmpNo
having count(distinct ProjNo)
     = ( select count(*)
           from Project
          where ProjType='database' )

In the subquery, we can use COUNT(*) because the Project table has one row per project. Presumably, ProjNo is the primary key and therefore unique and not null. Thus COUNT(DISTINCT ProjNo) would also work, but it would be less efficient as well as unnecessary. COUNT(ProjNo) would count the non-null values of ProjNo, and this too would be less efficient and unnecessary. Use COUNT(*) whenever the situation allows you to.

The subquery is not correlated, which simply means that it is independent of the main query. Therefore its result, which is a single number, can be calculated ahead of time by the database engine, and this number can then be used to compare to the group count for each employee in the HAVING clause. Queries that use uncorrelated subqueries like this are very efficient.

Dig Deeper on Oracle and SQL

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.