I have a single (Oracle 9i) table which captures events with the following columns:
- Source (VARCHAR)
- Event (VARCHAR)
- Time_Of_Event (timestamp)
- a number of other detail columns, call them Detail1, Detail2, etc.
I want to produce a summary report with a single row entry for each unique source||event combination with the following fields:
- Number of occurrences of this event from this source
- Earliest occurrence of this event from this source
My first attempt was to SELECT all columns and COUNT(source||event) and MIN(time_of_event). The problem I have is that as soon as I use these aggregate functions I have to add a GROUP BY clause, and the minute I do that, the count seems to fail -- that is, I get a count of 1 for each row in the result set.
I tried using ROLLUP and GROUPING, which got me a little closer, but it introduced the problem of generating a massive number of output rows.
My current thinking is that I will need to do a subquery and some sort of join on the result set from that and the original table.
Perhaps this is a simple query, but as a someone moving to Oracle from MySQL it has stumped me for a couple of days now.
The answer to your problems is really simple. You must learn to write valid GROUP BY queries. The rules of thumb that you need to remember are the following:
If the SELECT clause contains only aggregate expressions, you do not need a GROUP BY clause. The classic example of this is:
select count(*) as rows from daTable
If the SELECT clause contains any non-aggregate expressions, each must be in the GROUP BY clause. A simple example is:
select dept , count(*) as rows from daTable group by dept
Your problem can be traced directly to the inclusion of the Detail1 and Detail2 columns in the SELECT clause. Oracle will force you to include them in the GROUP BY clause, and consequently you get many result rows with a count of 1.
It appears that you have been led astray by MySQL, which happily will execute invalid GROUP BY queries. That is not a typo. MySQL will execute invalid GROUP BY queries.
If this sounds unreasonable, on one level it certainly is. Countless MySQL developers have fallen victim to this gotcha.
To be fair, we can understand what MySQL was trying to achieve -- performance. Their explanation for this unhappy situation is here:
In particular, this paragraph:
Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.
Of course, Detail1 and Detail2 are probably not constant in all rows in each of your groups, consequently your MySQL query result set will contain indeterminate values for those columns (i.e. you won't know which source/event they came from). But Oracle won't let you get away with those shenanigans.
So, what to do? The answer is to re-think your query.
Let me turn this around and ask you a similar question. Let's say we have a school with a number of classrooms, and in each classroom we have a number of students. Give me a query which will return one row per classroom with the following data:
- classroom number
- average student age
- student name
As soon as you figure out what the problem is here, you will be on your way.
This was first published in January 2007