UNION, INTERSECT, EXCEPT
Merge, find commonalities, and subtract result sets from multiple queries.
- UNION β combines rows from two queries, optionally removing duplicates
- INTERSECT β returns only rows present in both result sets
- EXCEPT β returns rows from the first set that are absent in the second Set theory brought to your data.
What Are Set Operations?
DfSet Operations
SQL set operations combine the results of two or more SELECT statements into a single result set. UNION merges all rows, INTERSECT returns only common rows, and EXCEPT returns rows from the first query that are not in the second. All participating queries must have the same number of columns with compatible data types.
SELECT columns FROM table_a
UNION
SELECT columns FROM table_b;
Each SELECT statement in a set operation must return the same number of columns, and corresponding columns must have compatible data types. ORDER BY can only appear at the very end.
UNION
UNION vs UNION ALL
| Operation | Duplicates | Performance | Use When |
|---|---|---|---|
UNION | Removed | Slower (sorting/dedup) | You need unique results |
UNION ALL | Kept | Faster (no dedup) | Duplicates are acceptable |
-- UNION: removes duplicate rows
SELECT city FROM customers
UNION
SELECT city FROM suppliers;
-- UNION ALL: keeps all rows including duplicates
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;
Basic UNION Example
-- Combine employee and customer names into one list
SELECT first_name, last_name, 'Employee' AS type
FROM employees
UNION
SELECT first_name, last_name, 'Customer' AS type
FROM customers
ORDER BY last_name, first_name;
UNION with Multiple Queries
-- Quarterly sales summary
SELECT 'Q1' AS quarter, SUM(sales) AS total_sales FROM sales WHERE quarter = 1
UNION
SELECT 'Q2', SUM(sales) FROM sales WHERE quarter = 2
UNION
SELECT 'Q3', SUM(sales) FROM sales WHERE quarter = 3
UNION
SELECT 'Q4', SUM(sales) FROM sales WHERE quarter = 4;
INTERSECT
Returns only the rows that appear in both result sets.
-- Find customers who are also employees
SELECT first_name, last_name, email
FROM customers
INTERSECT
SELECT first_name, last_name, email
FROM employees;
INTERSECT with Multiple Columns
-- Find employees whose current department and salary match a historical record
SELECT department_id, salary
FROM employees
INTERSECT
SELECT department_id, salary
FROM job_history;
| Scenario | INTERSECT Use |
|---|---|
| Common customers | Customers present in both 2022 and 2023 orders |
| Shared skills | Employees who have both certifications |
| Overlapping data | Records matching across two data sources |
-- Find products that were ordered in both January and February
SELECT product_id FROM order_items
INNER JOIN orders ON order_items.order_id = orders.order_id
WHERE MONTH(order_date) = 1
INTERSECT
SELECT product_id FROM order_items
INNER JOIN orders ON order_items.order_id = orders.order_id
WHERE MONTH(order_date) = 2;
EXCEPT (or MINUS)
Returns rows from the first query that do not appear in the second.
-- Customers who have never placed an order
SELECT customer_id, first_name, last_name
FROM customers
EXCEPT
SELECT DISTINCT c.customer_id, c.first_name, c.last_name
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
EXCEPT vs NOT IN vs NOT EXISTS
-- Using EXCEPT
SELECT customer_id FROM customers
EXCEPT
SELECT customer_id FROM orders;
-- Using NOT IN
SELECT customer_id FROM customers
WHERE customer_id NOT IN (SELECT customer_id FROM orders);
-- Using NOT EXISTS (often best performance)
SELECT c.customer_id FROM customers c
WHERE NOT EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
);
| Method | Pros | Cons |
|---|---|---|
EXCEPT | Clean, set-based syntax | Not supported everywhere |
NOT IN | Simple to write | Issues with NULLs |
NOT EXISTS | Best performance, NULL-safe | Slightly more verbose |
Rules for Set Operations
| Rule | Description |
|---|---|
| Same column count | All queries must return the same number of columns |
| Compatible types | Corresponding columns must have compatible data types |
| Column order matters | Columns are matched by position, not name |
| ORDER BY at the end | Only one ORDER BY, applied to the final result |
| Aliases from first query | Column names come from the first SELECT |
-- Valid: same number of compatible columns
SELECT first_name, last_name, salary
FROM employees
UNION
SELECT first_name, last_name, NULL AS salary
FROM customers;
-- Invalid: different column counts (error)
SELECT first_name, last_name
FROM employees
UNION
SELECT first_name, last_name, email
FROM customers;
When columns don't naturally align, use NULL with an alias to pad the shorter column list. The data types must still be compatible.
Combining Set Operations
-- Combine UNION and EXCEPT in a single query
SELECT first_name, last_name, 'Employee' AS type FROM employees
UNION ALL
SELECT first_name, last_name, 'Customer' AS type FROM customers
EXCEPT
SELECT first_name, last_name, 'Inactive' AS type FROM inactive_users
ORDER BY last_name, first_name;
Performance Considerations
| Factor | Recommendation |
|---|---|
UNION ALL vs UNION | Prefer UNION ALL when duplicates are fine β avoids expensive dedup |
| Index usage | Set operations may bypass indexes β consider temp tables |
| Sorting cost | UNION requires sorting for duplicate removal |
| Large datasets | Use temporary tables for complex multi-step set operations |
| Execution plan | Always EXPLAIN to check if indexes are used |
-- When UNION ALL is sufficient, always use it
SELECT customer_id FROM orders_2022
UNION ALL
SELECT customer_id FROM orders_2023;
-- Only use UNION if you truly need deduplication
SELECT customer_id FROM orders_2022
UNION
SELECT customer_id FROM orders_2023;
Practice Exercises
Exercise 1: Combine employee and customer names into a single deduplicated list.
-- Solution
SELECT first_name, last_name, 'Employee' AS type
FROM employees
UNION
SELECT first_name, last_name, 'Customer' AS type
FROM customers
ORDER BY last_name, first_name;
Exercise 2: Find products that were ordered in both 2022 and 2023.
-- Solution
SELECT product_id FROM order_items
INNER JOIN orders ON order_items.order_id = orders.order_id
WHERE YEAR(order_date) = 2022
INTERSECT
SELECT product_id FROM order_items
INNER JOIN orders ON order_items.order_id = orders.order_id
WHERE YEAR(order_date) = 2023;
Exercise 3: Find customers who registered in 2023 but have never placed an order.
-- Solution
SELECT customer_id, first_name, last_name
FROM customers
WHERE YEAR(registration_date) = 2023
EXCEPT
SELECT DISTINCT c.customer_id, c.first_name, c.last_name
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
Key Takeaways
- UNION combines rows from two queries; use UNION ALL to skip duplicate removal and improve performance.
- INTERSECT returns only rows that exist in both result sets.
- EXCEPT returns rows from the first set that are missing from the second.
- All queries in a set operation must have the same number of columns with compatible types.
- ORDER BY can only appear once at the very end of the combined query.