I am struggling with writing a query to count consecutive years from the current year. I do not want to use a cursor. Our data looks something like this:

ID  DateCol
1 02/01/2006
1 01/01/2006
1 01/01/2005
1 01/01/2004
1 01/01/1999
2 02/01/2006
2 01/01/2005
3 04/01/2006
3 04/01/1999
4 06/30/2000
4 08/01/1999

My desired output would be something like this:

ID ConYears
1    3
2    2
3    1

There is no requirement for the dates to be 365 days apart, just that they have different years. Any thoughts would be appreciated.

    Requires Free Membership to View

What a nice problem. There are probably ways to do it with just the single table, using a self-join technique similar to the one described in Gaps in sequences (9 February 2004). However, this could get complicated, because you said "consecutive years from the current year" and the first gap could be the current year, which the self-join wouldn't know of.

Once again we can use the handy-dandy integers table to help. If you don't have an integers table, you should; it's small, efficient and very useful.

create table integers
(i integer not null )

insert into integers values 
 (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

The integers table can then be used to generate the last 10 years:

select year(current_date) - i as yr
  from integers

yr
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997

The solution uses a CROSS JOIN and a LEFT OUTER JOIN. The CROSS JOIN creates all possible combinations of ID and year. Then the LEFT OUTER JOIN attempts to match each such ID and year combination to a row in the data table.

select X.ID
     , max(X.yr) as FirstMissing
     , year(current_date) 
      -max(X.yr) as ConYears
  from (
       select ID
            , year(current_date) - i as yr
         from integers
       cross
         join (
              select distinct ID
                from datatable
              ) as I     
       ) as X
left outer
  join datatable as T
    on T.ID = X.ID
   and year(T.DateCol) = X.yr
 where T.ID is null
group 
    by X.ID
having year(current_date) 
      -max(X.yr)          > 0

The derived table called X contains each combination of ID and year. This is the left table in the outer join, and it is joined to the data table, such that it matches the ID and year of the data. Note that it doesn't matter if more than one row of the data table matches, as is the case in your original data for ID 1 and year 2006.

Where a matching row is not found, using the IS NULL condition in the WHERE clause, that combination of ID and year is retained (matching rows are filtered out), and then, using a GROUP BY, only the maximum year which was not found for each ID is chosen, and the number of consecutive years calculated for each ID. Finally, the HAVING clause rejects any IDs like 4 which had 0 consecutive years from the current year.

ID FirstMissing ConYears
1      2003        3
2      2004        2
3      2005        1

Pretty neat, eh?

This was first published in January 2006

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: