Introduction to Window Functions
Perform calculations across sets of rows while preserving individual rows β no GROUP BY required.
- Row-Preserving β Results include every row from the original query
- PARTITION BY β Break data into logical groups for calculations
- ORDER BY β Define row sequence within each partition Window functions bridge the gap between aggregate and row-level queries.
What Are Window Functions?
DfWindow Function
A window function performs a calculation across a set of table rows that are related to the current row. Unlike aggregate functions, window functions do not collapse rows β each row remains in the output with its computed value.
Window functions operate on a "window" of rows defined by the OVER() clause. They are essential for tasks like ranking, running totals, moving averages, and comparing rows without using subqueries or self-joins.
Window Functions vs Aggregate Functions
| Feature | Aggregate Functions | Window Functions |
|---|---|---|
| Row Output | Collapses rows | Preserves all rows |
| GROUP BY | Required | Not required |
| OVER() Clause | Not used | Required |
| Calculation Scope | Entire group | Defined window |
| Example Use | SUM by department | Running total per employee |
Syntax Overview
DfWindow Function Syntax
function_name() OVER (PARTITION BY column ORDER BY column) β The PARTITION BY divides rows into groups, and ORDER BY defines the row ordering within each partition.
-- Basic window function structure
SELECT
employee_name,
department,
salary,
SUM(salary) OVER (PARTITION BY department) AS dept_total_salary,
salary - SUM(salary) OVER (PARTITION BY department) AS diff_from_dept_total
FROM employees;
-- Using multiple window functions in one query
SELECT
employee_name,
department,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS overall_rank,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
AVG(salary) OVER (PARTITION BY department) AS dept_avg_salary
FROM employees;
Window Function Categories
| Category | Functions | Purpose |
|---|---|---|
| Ranking | ROW_NUMBER, RANK, DENSE_RANK, NTILE | Assign positions to rows |
| Aggregate | SUM, AVG, COUNT, MIN, MAX | Compute running or group totals |
| Value | LAG, LEAD, FIRST_VALUE, LAST_VALUE | Access other rows in the window |
| Distribution | PERCENT_RANK, CUME_DIST, NTILE | Statistical distribution |
Practical Example: Salary Analysis
-- Comprehensive employee salary analysis using window functions
SELECT
e.employee_name,
e.department,
e.salary,
-- Rank employees within their department
RANK() OVER (
PARTITION BY e.department
ORDER BY e.salary DESC
) AS dept_salary_rank,
-- Compare to department average
ROUND(e.salary - AVG(e.salary) OVER (PARTITION BY e.department), 2) AS vs_dept_avg,
-- Running total of salaries in department
SUM(e.salary) OVER (
PARTITION BY e.department
ORDER BY e.salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_dept_total,
-- Percentage of department total
ROUND(e.salary / SUM(e.salary) OVER (PARTITION BY e.department) * 100, 1) AS pct_of_dept
FROM employees e
ORDER BY e.department, dept_salary_rank;
Always include an ORDER BY inside the OVER() clause when using aggregate window functions for running totals. Without it, the calculation applies to all rows in the partition simultaneously, giving the same value to every row.
Aggregate Window Functions
DfAggregate Window Function
An aggregate window function computes a summary value (SUM, AVG, COUNT) across the rows in the defined window and attaches that value to each row without collapsing the result set.
-- Running total and moving average
SELECT
order_date,
order_amount,
SUM(order_amount) OVER (ORDER BY order_date) AS running_total,
AVG(order_amount) OVER (
ORDER BY order_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3,
COUNT(*) OVER (ORDER BY order_date) AS cumulative_count
FROM orders;
Quick Reference Table
| Clause | Purpose | Required |
|---|---|---|
OVER() | Marks a window function | Yes |
PARTITION BY | Divides rows into groups | No |
ORDER BY | Orders rows within partition | No (but often needed) |
ROWS/RANGE | Defines frame boundaries | No |
Key Takeaways
- Window functions perform calculations across related rows without collapsing them
- The
OVER()clause defines the window β usePARTITION BYfor groups andORDER BYfor ordering - Window functions cannot be used in a
WHEREclause β use subqueries or CTEs instead - Categories include Ranking, Aggregate, Value, and Distribution functions
- Frame specification (ROWS/RANGE) controls which rows participate in the calculation