Self JOIN
Join a table to itself to uncover relationships within the same dataset.
- Hierarchical Queries — traverse employee–manager trees and org charts
- Row Comparison — find duplicates, pairs, and running differences
- Alias-Driven — requires distinct aliases to distinguish table instances The same table, two perspectives.
What Is a Self JOIN?
DfSelf JOIN
A Self JOIN is not a separate join type but a technique where a table is joined with itself using different aliases. Each alias represents a distinct instance of the same table, allowing you to compare rows within the same dataset — such as matching employees to their managers or finding duplicate records.
SELECT
e.first_name AS employee,
m.first_name AS manager
FROM employees e
INNER JOIN employees m
ON e.manager_id = m.employee_id;
Always use meaningful aliases (e.g., e for employee, m for manager) to make self joins readable and maintainable.
Syntax
SELECT columns
FROM table_name alias1
[INNER|LEFT|RIGHT] JOIN table_name alias2
ON alias1.column = alias2.column;
| Element | Purpose |
|---|---|
alias1 | First instance of the table |
alias2 | Second instance of the table |
ON | Relates rows within the same table |
| Join type | INNER, LEFT, RIGHT, or CROSS — choose as needed |
How It Works
A Self JOIN treats one table as two separate entities. The database engine processes it like any other join, but the same physical table appears twice with different aliases.
| Use Case | Alias 1 | Alias 2 | ON Condition |
|---|---|---|---|
| Employee → Manager | e (employee) | m (manager) | e.manager_id = m.employee_id |
| Find duplicates | a | b | a.col = b.col AND a.id < b.id |
| Previous row | curr | prev | curr.date > prev.date |
Basic Example — Employee and Manager
SELECT
e.employee_id,
e.first_name AS employee_name,
m.employee_id AS manager_id,
m.first_name AS manager_name
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.employee_id
ORDER BY e.employee_id;
Use LEFT JOIN when some employees may not have a manager (top-level executives). INNER JOIN would exclude those rows entirely.
Multi-Level Hierarchy
-- Walk four levels of management hierarchy
SELECT
e.first_name AS employee,
m1.first_name AS direct_manager,
m2.first_name AS skip_manager,
m3.first_name AS director
FROM employees e
LEFT JOIN employees m1 ON e.manager_id = m1.employee_id
LEFT JOIN employees m2 ON m1.manager_id = m2.employee_id
LEFT JOIN employees m3 ON m2.manager_id = m3.employee_id;
Finding Duplicates
-- Find employees who share the same email address
SELECT
e1.first_name,
e1.last_name,
e1.email
FROM employees e1
INNER JOIN employees e2
ON e1.email = e2.email
AND e1.employee_id < e2.employee_id;
| Duplicate Check | Join Condition | Notes |
|---|---|---|
| Same email | e1.email = e2.email | Unique constraint violation |
| Same salary | e1.salary = e2.salary | Common comparison |
| Same department + role | e1.dept = e2.dept AND e1.role = e2.role | Composite check |
Comparing Current vs Previous Values
-- Compare each employee's current salary to their previous salary
SELECT
e.first_name,
jh1.start_date AS current_start,
jh1.salary AS current_salary,
jh2.start_date AS previous_start,
jh2.salary AS previous_salary,
jh1.salary - jh2.salary AS salary_change
FROM job_history jh1
INNER JOIN job_history jh2
ON jh1.employee_id = jh2.employee_id
AND jh1.start_date > jh2.start_date
WHERE NOT EXISTS (
SELECT 1
FROM job_history jh3
WHERE jh3.employee_id = jh1.employee_id
AND jh3.start_date > jh2.start_date
AND jh3.start_date < jh1.start_date
)
ORDER BY e.first_name, jh1.start_date;
Finding Pairs and Combinations
-- Find all pairs of employees in the same city
SELECT
e1.first_name AS employee1,
e2.first_name AS employee2,
e1.city
FROM employees e1
INNER JOIN employees e2
ON e1.city = e2.city
AND e1.employee_id < e2.employee_id
ORDER BY e1.city, e1.first_name;
-- Find products in the same category with similar prices
SELECT
p1.product_name AS product1,
p2.product_name AS product2,
p1.category,
ABS(p1.price - p2.price) AS price_difference
FROM products p1
INNER JOIN products p2
ON p1.category = p2.category
AND p1.product_id < p2.product_id
WHERE ABS(p1.price - p2.price) < 10;
The condition alias1.id < alias2.id prevents duplicate pairs and self-matches. Use <= if you need self-referencing rows.
Self JOIN with Different Join Types
-- LEFT Self JOIN: include employees without managers
SELECT
e.first_name AS employee,
m.first_name AS manager
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.employee_id;
-- CROSS Self JOIN: every possible pair (excluding self)
SELECT
e1.first_name AS employee1,
e2.first_name AS employee2
FROM employees e1
CROSS JOIN employees e2
WHERE e1.employee_id != e2.employee_id;
| Join Type | Use Case | Rows Returned |
|---|---|---|
| INNER | Only matched pairs | Matched rows only |
| LEFT | All left rows + matches | Includes unmatched left |
| RIGHT | All right rows + matches | Includes unmatched right |
| CROSS | All combinations | M × N rows (filtered with WHERE) |
Performance Considerations
| Factor | Recommendation |
|---|---|
| Indexes | Index all columns used in the ON condition |
| Aliases | Always use aliases — the optimizer needs them |
| Duplicate prevention | Use < or != to avoid redundant pairs |
| Large tables | Avoid CROSS Self JOIN on large datasets |
| Execution plan | Use EXPLAIN to verify index usage |
-- Recommended indexes for common Self JOIN patterns
CREATE INDEX idx_employees_manager_id ON employees(manager_id);
CREATE INDEX idx_employees_department_id ON employees(department_id);
CREATE INDEX idx_employees_email ON employees(email);
CREATE INDEX idx_job_history_employee_date ON job_history(employee_id, start_date);
Practice Exercises
Exercise 1: List every employee with their direct manager, including employees who have no manager.
-- Solution
SELECT
e.employee_id,
e.first_name AS employee_name,
m.employee_id AS manager_id,
m.first_name AS manager_name
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id
ORDER BY e.employee_id;
Exercise 2: Find all pairs of employees who work in the same department.
-- Solution
SELECT
e1.first_name AS employee1,
e2.first_name AS employee2,
d.department_name
FROM employees e1
INNER JOIN employees e2
ON e1.department_id = e2.department_id
AND e1.employee_id < e2.employee_id
INNER JOIN departments d
ON e1.department_id = d.department_id
ORDER BY d.department_name, e1.first_name;
Exercise 3: Find employees who earn more than their direct manager.
-- Solution
SELECT
e.first_name AS employee_name,
e.salary AS employee_salary,
m.first_name AS manager_name,
m.salary AS manager_salary,
e.salary - m.salary AS salary_difference
FROM employees e
INNER JOIN employees m ON e.manager_id = m.employee_id
WHERE e.salary > m.salary
ORDER BY salary_difference DESC;
Key Takeaways
- A Self JOIN joins a table to itself using different aliases.
- It is not a distinct join type — it uses INNER, LEFT, RIGHT, or CROSS joins.
- Common use cases include hierarchies, duplicate detection, and row-to-row comparison.
- Always use aliases and include conditions like
<to prevent duplicate pairs. - Index the columns used in the ON clause for best performance.