I have two tables, Customer and Activity, and related fields are CustNum. The Activity table holds all information...
related to services performed. The Code field in Activity holds a number related to a type of service performed. I want to count all of the services for a particular customer prior to the first occurrence of a code (23) which is a lockout. I need to find out how many services were performed before the lockout occurred.
From your description, the Activity table must have some column which determines the sequence that services were performed in. Otherwise, you would never know which services were performed before the row with Code 23. So let's call this column ServiceDate. We want to count Activity rows that have a ServiceDate that is earlier than the ServiceDate of the row with Code 23.
This sample data illustrates the situations we need to cover:
CustNum ServiceDate Code 6 2003-02-28 15 6 2003-03-15 78 6 2003-03-30 23 6 2003-04-29 01 7 2003-03-31 23 8 2003-01-05 21 8 2003-01-31 23 8 2003-02-01 66 8 2003-02-02 23 9 2003-01-15 44 9 2003-02-22 55
For CustNum 6, there are two rows with a ServiceDate earlier than the ServiceDate of the row with Code 23. For Custnum 7, there is none. For CustNum 8, though, there is one row before the first Code 23, and another (or three, depending on how you look at it) before the second; this illustrates the importance of considering sample data to uncover hidden "whoopsies" in logic. If we simply count the rows before a Code 23 we will get two answers for CustNum 8. So let's agree we want only the number of rows before the first Code 23.
For CustNum 9, there is no row with Code 23, so any expression involving its date would be NULL. This is another "whoopsie" and means we need to refine the requirement once again. We want the number of rows before the first Code 23, but if there is no Code 23, then we want all rows. Whew!
select CustNum, count(*) as services from Activity A where ServiceDate < ( select min(ServiceDate) from Activity where CustNum = A.CustNum and Code = 23 ) or not exists ( select 1 from Activity where CustNum = A.CustNum and Code = 23 ) group by CustNum
Both subqueries are correlated, and will examine only those rows related (via the correlation variable A) to each outer row. The first subquery has an aggregate function but requires no GROUP BY because all rows in each group have the same CustNum.
If there is a Code 23 in the group, the first subquery returns a date, which is compared to ServiceDate, yielding either TRUE or FALSE for every row. The NOT EXISTS condition evaluates FALSE but this does not come into play, because no matter what the date comparison evaluates as, that evaluation is unchanged by combining it with OR FALSE.
If there is no Code 23 in the group, the first subquery returns NULL, and the ServiceDate comparison is therefore NULL. However, the NOT EXISTS is TRUE. In SQL, NULL OR TRUE evaluates as TRUE, so all rows in the group are selected if there is no Code 23.
Note that the two subqueries are not quite identical. You could replace the literal 1 with MIN(ServiceDate), and the NOT EXISTS subquery would work exactly the same way, except then the subqueries would be identical, a subtle hint to the optimizer not to actually run two subqueries. You might also try combining the two OR conditions into one, using COALESCE, although my preference is for two conditions because, to me, they convey the business rule more clearly when stated separately.
Dig Deeper on Oracle and SQL
Have a question for an expert?
Please add a title for your question
Get answers from a TechTarget expert on whatever's puzzling you.