Systems Engineering and RDBMS

Archive for August 18th, 2008

Back to the basics: NULL values and Aggregate functions

Posted by decipherinfosys on August 18, 2008

One common mistake for people new to SQL is remembering how the NULL values get treated when using aggregate functions.  Here is an example that helps illustrate the point:

declare @t table (col1 int)
insert into @t values (null)
insert into @t values (2)
insert into @t values (4)
AVG(col1) as Col_Avg,
COUNT(col1) as Col_Count,
COUNT(*) as Total_Count,
MAX(col1) as MAX_value,
MIN (col1) as MIN_value,
SUM(col1) as Total
from @t

Col_Avg     Col_Count   Total_Count MAX_value   MIN_value   Total
----------- ----------- ----------- ----------- ----------- -----------
3           2           3           4           2           6

Warning: Null value is eliminated by an aggregate or other SET operation.

As you can see from the Warning, the engine immediately tells you that the Null value was eliminated by an aggregate operation.  If you see the first column, you will see that the average is 3 and not 2 since it eliminated the row with the NULL value in it.  Likewise, when we did a COUNT(col1) vs a COUNT(*), there is a difference in the count – this again is because the NULL value has been eliminated by the aggregate function.

These are very important things to remember else you can end up writing code which will give out wrong results.  One of the ways to not fall into such issues is to address these issues right at the design time and if you have a numeric column, then define that as a not null column with a default value of 0.  That way, in the absence of the data, you can treat that as a 0 value and then the average would return the correct/intended data set – likewise for the count(col1) when the col1 is not null.

Posted in DB2 LUW, Oracle, SQL Server | Leave a Comment »