# Averages over a span of years -- Part 2

In Part 1 of this answer we examined averages produced by the AVG function on the following table:

```
 subject  | year | enrolled
----------+------+----------
 subject1 | 1998 |       20
 subject1 | 1999 |       23
 subject1 | 2000 |       16
 subject2 | 1999 |       10
 subject2 | 2000 |       21
 subject3 | 2000 |        9
```

The averages for each subject were the same whether we supplied the missing years or not. Here we'll explore why, and how to work with NULLs and aggregates.

When we supplied the missing years and found that averages were not affected, we demonstrated that aggregate functions exclude NULLs. The average was calculated using the number of non-NULL values in each subject group.
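To see this behavior in isolation, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example runs anywhere, and the table `t` is hypothetical. It averages subject2's two known enrollments together with one NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (enrolled integer)")
# subject2's two known enrollments plus a NULL standing in for the missing 1998 row
conn.executemany("insert into t (enrolled) values (?)", [(None,), (10,), (21,)])

(avgamt,) = conn.execute("select avg(enrolled) from t").fetchone()
print(avgamt)  # 15.5, i.e. 31 / 2: the NULL row is not counted
```

If the NULL had been counted as a value, the divisor would have been 3 rather than 2.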

To get averages calculated over every year in the given span, we must assume that the number enrolled was zero wherever an enrollment row is missing. This may not be a valid assumption in all applications. Remember, NULL is not equal to any value, and in particular NULL is not equal to zero, so we have to do something specific to make it behave that way.

The obvious solution is to check when the column is NULL, and use zero instead, which is completely in line with the assumption we are making and is the best way to solve the problem. We simply use the COALESCE function:

```
select allyears.subject
     , avg(enrolled)             as avgamt
     , avg(coalesce(enrolled,0)) as avgamtzero
  from ( select distinct subject
              , 1998+i as theyear
           from integers
              , subjects
          where i between 0 and 2
       ) as allyears
left outer
  join subjects
    on allyears.subject = subjects.subject
   and allyears.theyear = subjects.theyear
 group by allyears.subject

subject    avgamt  avgamtzero
subject1    19.67       19.67
subject2    15.50       10.33
subject3     9.00        3.00
```
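For readers who want to run this query, the following is a sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite, the column types, the `theyear` column name (taken from the join condition), and the contents of the `integers` helper table (0 through 9) are assumptions, since Part 1's table definitions are not shown here. ROUND and ORDER BY are added only to make the printed output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table subjects (subject text, theyear integer, enrolled integer);
insert into subjects values
  ('subject1', 1998, 20), ('subject1', 1999, 23), ('subject1', 2000, 16),
  ('subject2', 1999, 10), ('subject2', 2000, 21),
  ('subject3', 2000,  9);
-- helper table of integers used by the cross join (assumed to hold 0..9)
create table integers (i integer);
insert into integers values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);
""")

rows = conn.execute("""
select allyears.subject
     , round(avg(enrolled), 2)              as avgamt
     , round(avg(coalesce(enrolled, 0)), 2) as avgamtzero
  from ( select distinct subject
              , 1998 + i as theyear
           from integers
              , subjects
          where i between 0 and 2
       ) as allyears
left outer join subjects
    on allyears.subject = subjects.subject
   and allyears.theyear = subjects.theyear
 group by allyears.subject
 order by allyears.subject
""").fetchall()

for row in rows:
    print(row)
```

The three rows printed should agree with the result table above, apart from trailing zeros.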

COALESCE is a standard SQL function. If your database does not support it, look for an equivalent function such as ISNULL (SQL Server), NVL (Oracle), or IFNULL (MySQL).

We made the correct calculation over the span of years by generating rows that were missing. Was this the easiest way? Consider the following query:

```
select allyears.subject
     , sum(enrolled)             as sumamt
     , count(enrolled)           as countamt
     , count(*)                  as countrows
     , avg(enrolled)             as avgamt
     , avg(coalesce(enrolled,0)) as avgamtzero
  from ( select distinct subject
              , 1998+i as theyear
           from integers
              , subjects
          where i between 0 and 2
       ) as allyears
left outer
  join subjects
    on allyears.subject = subjects.subject
   and allyears.theyear = subjects.theyear
 group by allyears.subject

subject    sumamt  countamt  countrows  avgamt  avgamtzero
subject1       59         3          3   19.67       19.67
subject2       31         2          3   15.50       10.33
subject3        9         1          3    9.00        3.00
```

Notice that COUNT(*) counts rows, whether or not any particular column contains NULLs, while COUNT(enrolled) counts only the non-NULL values. COUNT(*) is the only exception to the rule that aggregate functions exclude NULLs, because it does not look at columns at all.

So the other way of getting the desired averages is:

```
sum(enrolled) / count(*) as avgamtzero
```
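One caveat worth checking in your own database: if `enrolled` is an integer column, some systems (SQLite, PostgreSQL and SQL Server among them) perform integer division here and drop the fractional part. The sketch below uses SQLite through Python's sqlite3 module, with a hypothetical table `filled` standing in for subject2's rows after the outer join, and compares the plain expression with one that forces decimal arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical stand-in for subject2 after the outer join: 1998 has no enrollment
create table filled (theyear integer, enrolled integer);
insert into filled values (1998, null), (1999, 10), (2000, 21);
""")

int_div, dec_div, avg_zero = conn.execute("""
select sum(enrolled) / count(*)        -- integer / integer truncates in SQLite
     , sum(enrolled) * 1.0 / count(*)  -- multiplying by 1.0 forces decimal arithmetic
     , avg(coalesce(enrolled, 0))
  from filled
""").fetchone()

print(int_div, round(dec_div, 2), round(avg_zero, 2))  # 10 10.33 10.33
```

Multiplying by 1.0 (or casting to a decimal type) is harmless where division already returns a decimal, so it is a reasonable habit when the column type is uncertain.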

I can't say whether this method is as efficient as AVG(COALESCE(xxx)), but my guess is that they are approximately the same.

Finally, there is one other approach. Instead of generating rows that were missing with a cross join to the integers table, just do this:

```
select subject
     , sum(enrolled) / 3 as avgamt
  from subjects
 group by subject

subject    avgamt
subject1    19.67
subject2    10.33
subject3     3.00
```

Is this reasonable? Can we just plug 3 into the calculation for the range of years desired? In this example, yes: the derived table subquery was itself built by inspecting the data to determine the range of years (1998 through 2000). In general, it's not that easy. A general version of the cross join would use additional subqueries to determine the first and last years, so that inspection would not be necessary. If you can do it by inspection, okay, but it's nice to know how to attack the general problem, too.
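As one possible sketch of that general approach (not necessarily the form the original answer had in mind), the derived table below takes the first and last years from a subquery on `subjects` instead of hard-coding 1998 and the range 0 to 2. The alias `bounds`, the column names `firstyear` and `lastyear`, the use of SQLite via Python's sqlite3 module, and an `integers` table holding 0 through 9 are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table subjects (subject text, theyear integer, enrolled integer);
insert into subjects values
  ('subject1', 1998, 20), ('subject1', 1999, 23), ('subject1', 2000, 16),
  ('subject2', 1999, 10), ('subject2', 2000, 21),
  ('subject3', 2000,  9);
create table integers (i integer);
insert into integers values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);
""")

rows = conn.execute("""
select allyears.subject
     , round(avg(coalesce(enrolled, 0)), 2) as avgamtzero
  from ( select distinct subject
              , bounds.firstyear + i as theyear
           from integers
              , subjects
              , ( select min(theyear) as firstyear
                       , max(theyear) as lastyear
                    from subjects
                ) as bounds
          where bounds.firstyear + i <= bounds.lastyear
       ) as allyears
left outer join subjects
    on allyears.subject = subjects.subject
   and allyears.theyear = subjects.theyear
 group by allyears.subject
 order by allyears.subject
""").fetchall()

for row in rows:
    print(row)
```

The `integers` table must contain at least as many values as there are years in the span; with 0 through 9 this sketch covers spans of up to ten years.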

This was last published in November 2002
