COUNT, SUM, AVG
Master the three most essential aggregation functions in SQL.
- COUNT β how many rows exist
- SUM β the total of all values
- AVG β the arithmetic mean These three functions handle 80% of aggregation needs.
COUNT Function
DfCOUNT
COUNT returns the number of rows that match a condition. COUNT(*) counts all rows; COUNT(column) counts non-NULL values.
-- Count all rows in a table
SELECT COUNT(*) AS total_employees
FROM employees;
-- Count non-NULL values in a specific column
SELECT COUNT(email) AS employees_with_email,
COUNT(*) - COUNT(email) AS missing_email
FROM employees;
Use COUNT(*) for row counts and COUNT(column) to check data completeness. The difference tells you how many NULLs exist.
COUNT with GROUP BY
-- Count employees per department
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
ORDER BY employee_count DESC;
-- Count unique values per group
SELECT department, COUNT(DISTINCT job_title) AS unique_roles
FROM employees
GROUP BY department;
SUM Function
DfSUM
SUM adds up all numeric values in a column. It ignores NULL values automatically.
-- Total payroll
SELECT SUM(salary) AS total_payroll
FROM employees;
-- Department budgets
SELECT department, SUM(salary) AS department_cost
FROM employees
GROUP BY department
ORDER BY department_cost DESC;
-- Sum with conditions
SELECT department,
SUM(CASE WHEN status = 'active' THEN salary ELSE 0 END) AS active_cost,
SUM(CASE WHEN status = 'terminated' THEN salary ELSE 0 END) AS terminated_cost
FROM employees
GROUP BY department;
SUM will return NULL if all values are NULL. Use COALESCE(SUM(col), 0) to guarantee a numeric result.
AVG Function
DfAVG
AVG calculates the arithmetic mean of a set of values. It sums all non-NULL values and divides by the count of non-NULL values.
-- Company-wide average salary
SELECT AVG(salary) AS average_salary
FROM employees;
-- Average salary per department
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;
-- Rounded averages for readability
SELECT department,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department;
Combining COUNT, SUM, and AVG
-- Comprehensive department report
SELECT department,
COUNT(*) AS headcount,
SUM(salary) AS total_cost,
ROUND(AVG(salary), 2) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary
FROM employees
GROUP BY department
ORDER BY headcount DESC;
Combine multiple aggregates in a single query to reduce database round trips and ensure consistency.
COUNT with Conditional Logic
-- Count active vs inactive employees per department
SELECT department,
COUNT(*) AS total,
SUM(CASE WHEN status = 'active' THEN 1 ELSE 0 END) AS active,
SUM(CASE WHEN status = 'terminated' THEN 1 ELSE 0 END) AS inactive
FROM employees
GROUP BY department;
-- Percentage calculation
SELECT department,
COUNT(*) AS total,
ROUND(SUM(CASE WHEN hire_date >= '2023-01-01' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 1) AS new_hire_pct
FROM employees
GROUP BY department;
SUM with JOIN
-- Total sales per customer
SELECT c.customer_name, SUM(o.amount) AS total_spent
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.customer_name
ORDER BY total_spent DESC;
AVG with Window Functions
-- Compare each department's average to company average
SELECT department,
AVG(salary) AS dept_avg,
AVG(AVG(salary)) OVER () AS company_avg,
AVG(salary) - AVG(AVG(salary)) OVER () AS difference
FROM employees
GROUP BY department;
NULL Handling Deep Dive
-- Demonstrate NULL behavior
SELECT COUNT(*) AS total_rows,
COUNT(commission_pct) AS non_null_commissions,
SUM(commission_pct) AS total_commission,
AVG(commission_pct) AS avg_commission
FROM employees;
| Function | NULL Handling |
|---|---|
| COUNT(*) | Counts all rows including NULLs |
| COUNT(col) | Ignores NULL values |
| SUM(col) | Ignores NULL values |
| AVG(col) | Ignores NULL values |
Common Mistakes
Avoid these common pitfalls: using COUNT(column) when you need COUNT(*), forgetting that AVG ignores NULLs, and not handling division by zero in percentage calculations.
-- Wrong: This excludes rows with NULL commission
SELECT COUNT(commission_pct) FROM employees;
-- Right: This counts all rows
SELECT COUNT(*) FROM employees;
-- Safe division for percentages
SELECT department,
CASE WHEN COUNT(*) > 0
THEN ROUND(SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 1)
ELSE 0
END AS female_pct
FROM employees
GROUP BY department;
Performance Considerations
Key Takeaways
- COUNT(*) is optimized in all databases and is fastest for row counting.
- COUNT(DISTINCT col) requires more memory; use it only when needed.
- SUM and AVG ignore NULLs automatically; use COALESCE for safety.
- Combine multiple aggregates in one query for efficiency.
- Use CASE expressions inside SUM/COUNT for conditional aggregation.
| Function | Best Use | Watch Out |
|---|---|---|
| COUNT(*) | Total row count | Nothing special |
| COUNT(col) | Non-NULL count | Excludes NULLs |
| SUM(col) | Numeric totals | Returns NULL if all NULL |
| AVG(col) | Mean values | Skewed by outliers |
| COUNT(DISTINCT) | Unique counts | Memory intensive |