Introduction to CTEs
Name temporary result sets to break complex queries into logical, readable steps.
- Readability β Replace nested subqueries with named, self-documenting blocks
- Reusability β Reference the same CTE multiple times in one query
- Modularity β Decompose logic into independent, testable pieces CTEs make SQL queries as readable as pseudocode.
What Is a CTE?
DfCommon Table Expression (CTE)
A named temporary result set defined with the WITH clause that exists only for the duration of a single query. It simplifies complex queries by allowing you to reference a result set by name rather than repeating subqueries.
CTEs were introduced in SQL:1999 and are supported by PostgreSQL, MySQL 8+, SQL Server, Oracle, and SQLite 3.8.3+.
CTE vs Subquery
| Feature | CTE | Inline Subquery |
|---|---|---|
| Readability | High β named blocks | Low β nested, hard to follow |
| Reusability | Can reference multiple times | Must repeat the subquery |
| Debugging | Easy β test CTE independently | Difficult β embedded in larger query |
| Scope | Only within one query | Only within one query |
| Performance | Optimizer may inline | Optimizer may inline |
Basic Syntax
-- Simple CTE structure
WITH cte_name (column1, column2) AS (
-- CTE query definition
SELECT column1, column2
FROM source_table
WHERE condition
)
-- Main query referencing the CTE
SELECT *
FROM cte_name
ORDER BY column1;
You can chain multiple CTEs with commas after the WITH keyword. Each CTE can reference previously defined CTEs in the same WITH clause, enabling sequential data transformations.
Single CTE Example
-- Find all employees earning above the company average
WITH avg_salary AS (
SELECT AVG(salary) AS company_avg
FROM employees
)
SELECT
e.employee_name,
e.department,
e.salary,
a.company_avg,
ROUND(e.salary - a.company_avg, 2) AS diff_from_avg
FROM employees e
CROSS JOIN avg_salary a
WHERE e.salary > a.company_avg
ORDER BY e.salary DESC;
Chained CTEs
-- Multi-step transformation using chained CTEs
WITH
-- Step 1: Filter to active employees only
active_employees AS (
SELECT employee_id, employee_name, department, salary
FROM employees
WHERE status = 'Active'
),
-- Step 2: Calculate department statistics
dept_stats AS (
SELECT
department,
COUNT(*) AS dept_size,
AVG(salary) AS dept_avg,
MAX(salary) AS dept_max
FROM active_employees
GROUP BY department
),
-- Step 3: Combine employee data with department stats
combined AS (
SELECT
ae.employee_name,
ae.department,
ae.salary,
ds.dept_avg,
ds.dept_max,
ROUND(ae.salary - ds.dept_avg, 2) AS diff_from_dept_avg,
ROUND(ae.salary / ds.dept_max * 100, 1) AS pct_of_dept_max
FROM active_employees ae
JOIN dept_stats ds ON ae.department = ds.department
)
-- Final: Select and filter the results
SELECT *
FROM combined
WHERE diff_from_dept_avg > 0
ORDER BY department, salary DESC;
CTE for Aggregation
-- Department-level summary with CTE
WITH dept_summary AS (
SELECT
department,
COUNT(*) AS employee_count,
SUM(salary) AS total_salary,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary
FROM employees
WHERE status = 'Active'
GROUP BY department
)
SELECT
department,
employee_count,
total_salary,
ROUND(avg_salary, 2) AS avg_salary,
min_salary,
max_salary,
ROUND(max_salary - min_salary, 2) AS salary_range
FROM dept_summary
WHERE employee_count >= 5
ORDER BY total_salary DESC;
CTE for Deduplication
-- Remove duplicate records keeping the latest entry
WITH ranked_records AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY customer_id, product_id
ORDER BY created_at DESC
) AS rn
FROM order_items
)
SELECT
customer_id,
product_id,
quantity,
price,
created_at
FROM ranked_records
WHERE rn = 1;
CTEs are not materialized by default in most databases. Each reference to the CTE may re-execute the underlying query. If the CTE is referenced multiple times and performance matters, consider creating a temporary table instead.
CTE for Reporting
-- Sales report with category breakdown
WITH monthly_sales AS (
SELECT
DATE_FORMAT(sale_date, '%Y-%m') AS month,
category,
SUM(amount) AS total_sales,
COUNT(*) AS transaction_count
FROM sales
GROUP BY DATE_FORMAT(sale_date, '%Y-%m'), category
),
ranked_categories AS (
SELECT
month,
category,
total_sales,
transaction_count,
RANK() OVER (PARTITION BY month ORDER BY total_sales DESC) AS category_rank
FROM monthly_sales
)
SELECT
month,
category,
total_sales,
transaction_count,
category_rank
FROM ranked_categories
WHERE category_rank <= 3
ORDER BY month, category_rank;
Recursive CTE Preview
-- Employee hierarchy using recursive CTE (simplified)
WITH RECURSIVE org_chart AS (
-- Base case: top-level managers
SELECT employee_id, employee_name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: direct reports
SELECT e.employee_id, e.employee_name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT employee_id, employee_name, level
FROM org_chart
ORDER BY level, employee_name;
The RECURSIVE keyword is required in PostgreSQL and MySQL when using recursive CTEs. SQL Server uses a different approach without the keyword. Always test recursive CTEs with a maximum depth to prevent infinite loops.
Quick Reference
| Clause | Purpose | Example |
|---|---|---|
WITH | Begins CTE definition | WITH cte AS (...) |
RECURSIVE | Enables self-referencing | WITH RECURSIVE cte AS (...) |
UNION ALL | Combines base and recursive cases | Required in recursive CTEs |
| Column aliases | Optional naming | WITH cte (a, b) AS (...) |
Key Takeaways
- CTEs are named temporary result sets defined with the
WITHclause - They improve readability by replacing nested subqueries with logical named blocks
- Multiple CTEs can be chained with commas, each referencing the previous one
- CTEs are not materialized by default β repeated references may re-execute the query
- Recursive CTEs (WITH RECURSIVE) traverse hierarchical data like org charts and trees