πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Subqueries

Advanced SQLSubqueries🟒 Free Lesson

Advertisement

Advanced SQL

Subqueries

Nest one query inside another to perform complex data retrieval with scalar, row, table, and correlated subqueries.

  • Scalar Subqueries β€” Return a single value for use in comparisons
  • Table Subqueries β€” Generate derived tables for filtering and joining Think of subqueries as asking a question about the answer to another question.

What Is a Subquery?

DfSubquery

A subquery is a SELECT statement embedded inside another SQL statement, enclosed in parentheses. The outer query uses the subquery's result as a value, set of values, or temporary table.

SELECT * FROM employeesWHERE dept_id IN (SELECT id FROM departments WHERE name = 'Engineering'-- Subquery returns: (1, 2, 3));Inner query runs first β†’ result feeds outer query
-- Basic subquery structure
SELECT column1, column2
FROM table1
WHERE column3 IN (
    SELECT column FROM table2 WHERE condition
);

Subqueries can appear in four locations:

  • WHERE β€” Filter rows based on subquery results
  • FROM β€” Create a derived table (subquery factoring)
  • SELECT β€” Return a scalar value per row
  • HAVING β€” Filter groups based on aggregated subquery results

Scalar Subqueries

DfScalar Subquery

Returns exactly one row and one column. Used anywhere a single value is expected β€” in SELECT, WHERE, or HAVING clauses.

-- Find employees who earn more than the company average
SELECT
    first_name,
    last_name,
    salary
FROM employees
WHERE salary > (
    SELECT AVG(salary) FROM employees
);
-- Show each employee's salary alongside the company average
SELECT
    first_name,
    last_name,
    salary,
    (SELECT AVG(salary) FROM employees) AS company_avg,
    salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;
Scalar subqueries must return exactly one row. If the subquery returns multiple rows, the database throws an error. Use LIMIT 1 or aggregation to guarantee a single result.

Row Subqueries

DfRow Subquery

Returns a single row with multiple columns. Used with comparisons like =, >, or < to match against multiple fields simultaneously.

-- Find employees who share the same department and salary as employee ID 101
SELECT
    first_name,
    last_name,
    department,
    salary
FROM employees
WHERE (department, salary) = (
    SELECT department, salary
    FROM employees
    WHERE employee_id = 101
);

IN Subqueries

DfIN Subquery

Tests whether a value exists in the result set returned by a subquery. Returns TRUE if the value matches any row in the subquery.

-- Find all orders from customers in New York
SELECT
    order_id,
    order_date,
    amount
FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE city = 'New York'
);
-- Find products that have never been ordered
SELECT
    product_name,
    category
FROM products
WHERE product_id NOT IN (
    SELECT DISTINCT product_id
    FROM order_items
    WHERE product_id IS NOT NULL
);
When using NOT IN, ensure the subquery contains no NULL values. A single NULL in the result set causes NOT IN to return zero rows. Filter NULLs explicitly or use NOT EXISTS instead.

EXISTS Subqueries

DfEXISTS

Tests whether a subquery returns any rows. Returns TRUE if at least one row exists. Does not return data β€” only checks for existence.

-- Find customers who have placed at least one order
SELECT
    customer_id,
    customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);
-- Find customers with no orders using NOT EXISTS
SELECT
    customer_id,
    customer_name
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

IN vs EXISTS Comparison

ScenarioUse INUse EXISTS
Subquery returns small listβœ… BestWorks
Subquery returns large datasetWorksβœ… Best
Need to check for NULLs⚠️ Riskyβœ… Safe
Outer table is smallβœ… BestWorks
Subquery is complex/JOINedWorksβœ… Best

Derived Table Subqueries

DfDerived Table

A subquery in the FROM clause that produces a temporary result set. Must be aliased and can be joined like any regular table.

-- Use a derived table to compute department statistics first
SELECT
    d.dept_name,
    d_stats.avg_salary,
    d_stats.headcount
FROM departments d
JOIN (
    SELECT
        department_id,
        AVG(salary) AS avg_salary,
        COUNT(*) AS headcount
    FROM employees
    GROUP BY department_id
) d_stats ON d.dept_id = d_stats.department_id;
-- Filter on aggregated results using a derived table
SELECT *
FROM (
    SELECT
        customer_id,
        COUNT(*) AS order_count,
        SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
) customer_totals
WHERE total_spent > 1000
ORDER BY total_spent DESC;

Subqueries in SELECT

-- Correlated scalar subquery in SELECT clause
SELECT
    e.first_name,
    e.last_name,
    e.salary,
    e.department,
    e.salary - (
        SELECT AVG(salary)
        FROM employees e2
        WHERE e2.department = e.department
    ) AS diff_from_dept_avg
FROM employees e;
-- Count related rows with a scalar subquery
SELECT
    c.customer_id,
    c.customer_name,
    (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.customer_id) AS order_count,
    (SELECT SUM(amount) FROM orders o WHERE o.customer_id = c.customer_id) AS total_spent
FROM customers c;
Subqueries in SELECT execute once per row. For large datasets, consider using window functions or JOINs instead for better performance.

Subqueries in HAVING

-- Find departments with above-average headcount
SELECT
    department,
    COUNT(*) AS headcount
FROM employees
GROUP BY department
HAVING COUNT(*) > (
    SELECT AVG(dept_count)
    FROM (
        SELECT COUNT(*) AS dept_count
        FROM employees
        GROUP BY department
    ) dept_sizes
);
-- Find product categories where average order value exceeds the overall average
SELECT
    category,
    ROUND(AVG(amount), 2) AS avg_order_value
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
GROUP BY category
HAVING AVG(amount) > (
    SELECT AVG(amount) FROM order_items
);

Subquery vs JOIN Performance

-- Using subquery (may be slower on large tables)
SELECT *
FROM employees
WHERE department_id IN (
    SELECT department_id FROM departments WHERE location = 'NYC'
);

-- Using JOIN (often optimized better by the query planner)
SELECT e.*
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.location = 'NYC';

When to Prefer Subqueries Over JOINs

SituationSubqueryJOIN
Checking existence onlyβœ… EXISTSWorks
Avoiding duplicate rowsβœ… AutomaticMay duplicate
Readability for beginnersβœ… Clear intentCan be complex
Performance on large dataDependsβœ… Usually faster
Filtering on aggregated resultsβœ… CleanerNeeds HAVING
Most modern query optimizers automatically rewrite IN subqueries as semi-joins. Write the version that is clearest; let the optimizer handle performance.

Practical Examples

Top N Per Group

-- Find the top 3 earners in each department
SELECT
    first_name,
    last_name,
    department,
    salary
FROM employees e1
WHERE 3 > (
    SELECT COUNT(*)
    FROM employees e2
    WHERE e2.department = e1.department
      AND e2.salary > e1.salary
)
ORDER BY department, salary DESC;

Self-Referencing Subquery

-- Find employees who earn more than their manager
SELECT
    e.first_name AS employee,
    e.salary AS emp_salary,
    m.first_name AS manager,
    m.salary AS mgr_salary
FROM employees e
JOIN employees m ON e.manager_id = m.employee_id
WHERE e.salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department = e.department
);

Summary

Key Takeaways

  1. Scalar subqueries return one value β€” use them in SELECT and WHERE for comparisons.
  2. IN tests membership; EXISTS tests existence β€” prefer EXISTS with large or NULL-prone datasets.
  3. Derived tables in FROM create temporary result sets that can be joined like regular tables.
  4. Subqueries in SELECT execute per row β€” use window functions for better performance on large data.
  5. NOT IN fails silently if the subquery returns any NULL values β€” filter or use NOT EXISTS.
  6. Modern optimizers often convert subqueries to joins automatically β€” write for clarity first.
⭐

Premium Content

Subqueries

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert SQL Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement