Recursive CTEs
Traverse hierarchies, graphs, and tree structures with self-referencing queries.
- Base Case β Starting point of the recursion (anchor member)
- Recursive Case β Self-referencing query that joins back to itself
- Termination β Stops when no more rows are produced Recursive CTEs solve problems that iterative code handles β entirely in SQL.
What Is a Recursive CTE?
DfRecursive CTE
A Common Table Expression that references itself, consisting of an anchor member (base case) combined with a recursive member using UNION ALL. The recursion continues until the recursive member returns no rows.
Recursive CTEs are essential for traversing hierarchical data like organizational charts, bill of materials, file systems, category trees, and network graphs.
Structure
| Component | Purpose | Example |
|---|---|---|
| Anchor Member | Starting rows (non-recursive) | Top-level managers |
| UNION ALL | Combines anchor and recursive results | Always required |
| Recursive Member | Joins back to the CTE | Find direct reports |
| Termination | Recursive member returns empty set | No more children found |
Basic Syntax
-- Recursive CTE skeleton
WITH RECURSIVE cte_name AS (
-- Anchor member: starting point
SELECT columns
FROM table
WHERE condition
UNION ALL
-- Recursive member: self-referencing
SELECT columns
FROM table
JOIN cte_name ON join_condition
)
SELECT * FROM cte_name;
Always set a recursion limit or include a termination condition. Infinite recursion will crash your query and consume excessive resources. Most databases have a default max recursion depth (e.g., PostgreSQL: 100, SQL Server: 100).
Example 1: Organizational Hierarchy
-- Employee hierarchy with levels and path
WITH RECURSIVE org_hierarchy AS (
-- Anchor: top-level employees (no manager)
SELECT
employee_id,
employee_name,
manager_id,
1 AS level,
employee_name AS hierarchy_path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: employees with managers
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
oh.level + 1,
CONCAT(oh.hierarchy_path, ' β ', e.employee_name)
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.employee_id
)
SELECT
employee_id,
employee_name,
level,
hierarchy_path
FROM org_hierarchy
ORDER BY level, employee_name;
Output Example
| employee_name | level | hierarchy_path |
|---|---|---|
| CEO | 1 | CEO |
| VP Engineering | 2 | CEO β VP Engineering |
| VP Sales | 2 | CEO β VP Sales |
| Engineering Manager | 3 | CEO β VP Engineering β Engineering Manager |
| Developer 1 | 4 | CEO β VP Engineering β Engineering Manager β Developer 1 |
Example 2: Bill of Materials
-- Calculate total cost of a product including all components
WITH RECURSIVE bom AS (
-- Anchor: top-level product
SELECT
product_id,
component_id,
quantity,
unit_cost,
quantity * unit_cost AS total_cost,
1 AS depth
FROM bill_of_materials
WHERE product_id = 'FINAL_PRODUCT'
UNION ALL
-- Recursive: sub-components
SELECT
b.product_id,
bom.component_id,
b.quantity * bom.quantity,
bom.unit_cost,
b.quantity * bom.quantity * bom.unit_cost,
bom.depth + 1
FROM bill_of_materials b
JOIN bom ON b.product_id = bom.component_id
WHERE bom.depth < 10 -- Safety limit
)
SELECT
component_id,
SUM(total_cost) AS component_total_cost
FROM bom
GROUP BY component_id
ORDER BY component_total_cost DESC;
The depth < 10 condition prevents infinite recursion in case of circular references in the BOM. Always include a depth limit when the data might contain cycles.
Example 3: Category Tree Traversal
-- Flatten a nested category hierarchy
WITH RECURSIVE category_tree AS (
-- Anchor: root categories
SELECT
category_id,
category_name,
parent_id,
1 AS depth,
category_name AS full_path
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive: child categories
SELECT
c.category_id,
c.category_name,
c.parent_id,
ct.depth + 1,
CONCAT(ct.full_path, ' > ', c.category_name)
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.category_id
WHERE ct.depth < 5
)
SELECT
category_id,
category_name,
depth,
full_path
FROM category_tree
ORDER BY full_path;
Example 4: Date Series Generation
-- Generate a series of consecutive dates
WITH RECURSIVE date_series AS (
-- Anchor: start date
SELECT '2024-01-01' AS series_date
UNION ALL
-- Recursive: add one day
SELECT DATE_ADD(series_date, INTERVAL 1 DAY)
FROM date_series
WHERE series_date < '2024-12-31'
)
SELECT series_date
FROM date_series;
Example 5: Graph Traversal
-- Find all reachable nodes from a starting point in a graph
WITH RECURSIVE graph_walk AS (
-- Anchor: starting node
SELECT
node_id,
node_name,
0 AS distance,
CAST(node_name AS CHAR(500)) AS path
FROM nodes
WHERE node_id = 1
UNION ALL
-- Recursive: follow edges
SELECT
n.node_id,
n.node_name,
gw.distance + 1,
CONCAT(gw.path, ' β ', n.node_name)
FROM nodes n
JOIN edges e ON n.node_id = e.target_node_id
JOIN graph_walk gw ON e.source_node_id = gw.node_id
WHERE gw.distance < 5
AND NOT FIND_IN_SET(n.node_id, REPLACE(gw.path, ' β ', ','))
)
SELECT node_id, node_name, distance, path
FROM graph_walk
ORDER BY distance, node_name;
Graph traversal can produce exponential result sets. Always limit depth and consider cycle detection. The FIND_IN_SET trick above prevents revisiting nodes but adds overhead β use with caution on large graphs.
Cycle Detection
DfCycle Detection
A technique to prevent infinite recursion by tracking visited nodes. Essential when the data model allows cycles (e.g., AβBβCβA).
-- Safe recursive CTE with cycle detection
WITH RECURSIVE traversal AS (
SELECT
node_id,
ARRAY[node_id] AS visited_path,
0 AS depth
FROM nodes
WHERE node_id = 'START'
UNION ALL
SELECT
n.node_id,
t.visited_path || n.node_id,
t.depth + 1
FROM nodes n
JOIN edges e ON n.node_id = e.target_node_id
JOIN traversal t ON e.source_node_id = t.node_id
WHERE t.depth < 10
AND n.node_id <> ALL(t.visited_path) -- Cycle detection
)
SELECT node_id, visited_path, depth
FROM traversal
ORDER BY depth;
Best Practices
| Practice | Reason |
|---|---|
| Always include depth limit | Prevent infinite recursion |
| Detect cycles in graph data | Avoid infinite loops |
| Use UNION ALL (not UNION) | UNION deduplicates, breaking recursion |
| Test with small datasets first | Recursive CTEs can be expensive |
| Add safety limits | Default recursion limits vary by database |
Recursive CTE vs Alternatives
| Approach | Pros | Cons |
|---|---|---|
| Recursive CTE | Pure SQL, readable | May hit recursion limits |
| Iterative code (app layer) | Full control, no limits | More code, network overhead |
| Nested sets model | Fast reads | Complex writes |
| Materialized path | Fast ancestry lookups | Path string manipulation |
Key Takeaways
- Recursive CTEs consist of an anchor member, UNION ALL, and a self-referencing recursive member
- The recursion terminates when the recursive member returns no rows
- Always include a depth limit and cycle detection for safety
- Common use cases: org hierarchies, bill of materials, category trees, date series, graph traversal
- Use
UNION ALLβ usingUNIONinstead will silently break recursion by deduplicating intermediate results