
Averages over a span of years -- Part 1

For the following sample relation:

```
subject  | year | enrolled
---------+------+---------
subject1 | 1998 |       20
subject1 | 1999 |       23
subject1 | 2000 |       16
subject2 | 1999 |       10
subject2 | 2000 |       21
subject3 | 2000 |        9
```

How would I create a query that calculates the average enrollment for each subject over the years? Thanks!

The answer depends on what is meant by an average "over the years."

Here's a solution involving a straightforward average calculation, using the AVG function:

```
select subject
     , avg(enrolled) as avgamt
  from subjects
group
    by subject

subject   avgamt
subject1   19.67
subject2   15.50
subject3    9.00
```

Everything looks okay, right? Each subject has one or more entries in the table, and the average was calculated as the sum per subject divided by the number of rows, right?
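To check that arithmetic, here is a minimal sketch that loads the sample rows into an in-memory SQLite database from Python and compares AVG with the sum divided by the row count. SQLite, the Python harness, and the `sum_over_rows` column name are assumptions for illustration; the original question does not name a database.

```python
import sqlite3

# Sample relation from the question, loaded into a throwaway database.
conn = sqlite3.connect(":memory:")
conn.execute("create table subjects (subject text, year integer, enrolled integer)")
conn.executemany(
    "insert into subjects values (?, ?, ?)",
    [
        ("subject1", 1998, 20),
        ("subject1", 1999, 23),
        ("subject1", 2000, 16),
        ("subject2", 1999, 10),
        ("subject2", 2000, 21),
        ("subject3", 2000, 9),
    ],
)

# AVG next to an explicit sum / row count, per subject.
averages = conn.execute(
    """
    select subject
         , avg(enrolled) as avgamt
         , sum(enrolled) * 1.0 / count(*) as sum_over_rows
      from subjects
     group by subject
     order by subject
    """
).fetchall()

for subject, avgamt, sum_over_rows in averages:
    print(subject, round(avgamt, 2), round(sum_over_rows, 2))
```

Both computed columns agree for every subject, because every row in the original table has a non-NULL enrollment.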

But what if the average needs to be calculated over all years in the span of years from 1998 to 2000? How do we deal with the fact that some subjects are missing some years?

What we could do is supply the missing years for each subject. There's more than one way to do this, but here's a simple one. The following query uses the integers table (described in "Finding all the dates between two dates," 10 June 2002, and also in "Aggregates for date ranges," 4 October 2002). The integers table is joined with the original table in a cross join to generate the desired range of years for each subject:

```
select distinct
       subject
     , 1998+i as theyear
  from integers
     , subjects
 where i between 0 and 2

subject   theyear
subject1   1998
subject1   1999
subject1   2000
subject2   1998
subject2   1999
subject2   2000
subject3   1998
subject3   1999
subject3   2000
```
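The integers table itself is not defined in this answer. As a minimal sketch, assuming it is a single-column table named `integers` with a column `i` holding 0 through 9 (an assumption here; the referenced articles give the actual definition), the cross join can be tried in an in-memory SQLite database from Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table subjects (subject text, year integer, enrolled integer)")
conn.executemany(
    "insert into subjects values (?, ?, ?)",
    [
        ("subject1", 1998, 20),
        ("subject1", 1999, 23),
        ("subject1", 2000, 16),
        ("subject2", 1999, 10),
        ("subject2", 2000, 21),
        ("subject3", 2000, 9),
    ],
)

# Assumed shape of the integers table: one column i holding 0..9.
conn.execute("create table integers (i integer primary key)")
conn.executemany("insert into integers values (?)", [(n,) for n in range(10)])

# Cross join: every subject paired with every year in 1998..2000.
# DISTINCT collapses the duplicates caused by subjects with several rows.
all_years = conn.execute(
    """
    select distinct
           subject
         , 1998 + i as theyear
      from integers
         , subjects
     where i between 0 and 2
     order by subject, theyear
    """
).fetchall()

for subject, theyear in all_years:
    print(subject, theyear)
```

Without DISTINCT, subject1 would appear three times for each year, once per row it has in the original table.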

How did we know to use "1998+i" and "i between 0 and 2" in this query? By inspection of the sample data. In the general case, inspection would not be practical; instead, additional subqueries would obtain the lowest and highest years from the data.
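For the general case, one possible shape for those subqueries is sketched below, again in an in-memory SQLite database driven from Python. The `integers` table definition (a column `i` holding 0 through 9) is an assumption, and this is only one way to write it, not necessarily the form the author had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table subjects (subject text, year integer, enrolled integer)")
conn.executemany(
    "insert into subjects values (?, ?, ?)",
    [
        ("subject1", 1998, 20),
        ("subject1", 1999, 23),
        ("subject1", 2000, 16),
        ("subject2", 1999, 10),
        ("subject2", 2000, 21),
        ("subject3", 2000, 9),
    ],
)

# Assumed integers table: one column i holding 0..9. It needs at least
# as many rows as there are years in the span being generated.
conn.execute("create table integers (i integer primary key)")
conn.executemany("insert into integers values (?)", [(n,) for n in range(10)])

# The lowest year replaces the literal 1998, and the width of the span
# (highest year minus lowest year) replaces the literal 2.
general_years = conn.execute(
    """
    select distinct
           subject
         , (select min(year) from subjects) + i as theyear
      from integers
         , subjects
     where i between 0
                 and (select max(year) - min(year) from subjects)
     order by subject, theyear
    """
).fetchall()

print(len(general_years), general_years[0], general_years[-1])
```

On the sample data this produces the same nine subject and year pairs as the hard-coded version.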

We can now use the results of this cross join as a derived table and join it to the original table. We want to use a left outer join, since we know some rows will not match:

```
select allyears.subject
     , allyears.theyear
     , enrolled
  from (
       select distinct
              subject
            , 1998+i as theyear
         from integers
            , subjects
        where i between 0 and 2
       ) as allyears
left outer
  join subjects
    on allyears.subject = subjects.subject
   and allyears.theyear = subjects.year
order
    by allyears.subject
     , allyears.theyear

subject   theyear  enrolled
subject1   1998       20
subject1   1999       23
subject1   2000       16
subject2   1998        -
subject2   1999       10
subject2   2000       21
subject3   1998        -
subject3   1999        -
subject3   2000        9
```

Okay, that looks fine. So let's try the averages again:

```
select allyears.subject
     , avg(enrolled) as avgamt
  from (
       select distinct
              subject
            , 1998+i as theyear
         from integers
            , subjects
        where i between 0 and 2
       ) as allyears
left outer
  join subjects
    on allyears.subject = subjects.subject
   and allyears.theyear = subjects.year
group
    by allyears.subject

subject   avgamt
subject1   19.67
subject2   15.50
subject3    9.00
```

Uh oh. These are our original results. How can this be?

The explanation is that aggregate functions exclude NULLs. Please see Part 2 of this answer for more information on working with NULLs and aggregates.
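To see the exclusion directly, the following sketch runs the same outer join in an in-memory SQLite database from Python and compares COUNT(*) with COUNT(enrolled) for each subject. The `integers` table definition and the `allrows` and `nonnull` column names are assumptions for illustration, and the join uses the `year` column as it is named in the sample relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table subjects (subject text, year integer, enrolled integer)")
conn.executemany(
    "insert into subjects values (?, ?, ?)",
    [
        ("subject1", 1998, 20),
        ("subject1", 1999, 23),
        ("subject1", 2000, 16),
        ("subject2", 1999, 10),
        ("subject2", 2000, 21),
        ("subject3", 2000, 9),
    ],
)

# Assumed integers table: one column i holding 0..9.
conn.execute("create table integers (i integer primary key)")
conn.executemany("insert into integers values (?)", [(n,) for n in range(10)])

# count(*) counts every joined row; count(enrolled), sum(enrolled) and
# avg(enrolled) all skip the rows where enrolled is NULL.
null_check = conn.execute(
    """
    select allyears.subject
         , count(*)        as allrows
         , count(enrolled) as nonnull
         , sum(enrolled)   as total
         , avg(enrolled)   as avgamt
      from (
           select distinct
                  subject
                , 1998 + i as theyear
             from integers
                , subjects
            where i between 0 and 2
           ) as allyears
      left outer
      join subjects
        on allyears.subject = subjects.subject
       and allyears.theyear = subjects.year
     group by allyears.subject
     order by allyears.subject
    """
).fetchall()

for row in null_check:
    print(row)
```

For subject2 the join yields three rows, but only two carry an enrollment figure, so AVG divides 31 by 2 rather than by 3.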
