πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

GROUP BY Clause

SQL AggregationGrouping🟒 Free Lesson

Advertisement

SQL Aggregation

GROUP BY Clause

Collapse thousands of rows into meaningful summaries with grouping and aggregation.

  • Grouping β€” bucket rows by one or more columns
  • Aggregation β€” compute COUNT, SUM, AVG, MIN, MAX per group
  • Filtering Groups β€” use HAVING to filter after aggregation Turn detail into insight.

What Is GROUP BY?

DfGROUP BY

The GROUP BY clause divides a result set into groups based on the values in one or more columns. Aggregate functions (COUNT, SUM, AVG, MIN, MAX) are then applied to each group, producing one output row per group. It is the foundation of data aggregation in SQL.

GROUP BY PipelineRaw DataEng | AliceSales | BobEng | CarolSales | DanEng | EveGROUP BY deptEngEngEngSalesSalesAggregateCOUNT(*):3, 2SUM(salary):210k,120kAVG(salary):70k, 60kResultEng | 3 | 210kSales | 2 | 120k
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

Any column in the SELECT list that is not wrapped in an aggregate function must appear in the GROUP BY clause. Most databases enforce this rule strictly.

Syntax

SELECT
    column1,
    aggregate_function(column2)
FROM table_name
GROUP BY column1;
ElementPurpose
GROUP BYDefines the grouping column(s)
SELECTColumns to display β€” must be aggregated or in GROUP BY
Aggregate functionCOUNT, SUM, AVG, MIN, MAX, etc.

Basic Example

-- Count employees and average salary per department
SELECT
    department,
    COUNT(*) AS employee_count,
    ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

Single Column Grouping

-- Total salary expenditure per department
SELECT
    department,
    COUNT(*) AS headcount,
    SUM(salary) AS total_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department;
DepartmentHeadcountTotal SalaryMinMax
Engineering454,500,000∣4,500,000 |60,000$150,000
Marketing201,600,000∣1,600,000 |45,000$110,000
Sales352,450,000∣2,450,000 |40,000$120,000

Multiple Column Grouping

-- Breakdown by department and job title
SELECT
    department,
    job_title,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department, job_title
ORDER BY department, avg_salary DESC;

-- Monthly revenue by product
SELECT
    YEAR(order_date) AS order_year,
    MONTH(order_date) AS order_month,
    product_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_revenue
FROM orders
GROUP BY YEAR(order_date), MONTH(order_date), product_id
ORDER BY order_year, order_month, total_revenue DESC;

WHERE vs HAVING

ClauseFiltersTimingCan Use Aggregates
WHEREIndividual rowsBefore GROUP BYNo
HAVINGGroupsAfter GROUP BYYes

Query Execution Order

-- WHERE filters rows first, then GROUP BY groups, then HAVING filters groups
SELECT
    department,
    COUNT(*) AS emp_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE status = 'Active'            -- Step 1: filter rows
GROUP BY department                 -- Step 2: group remaining rows
HAVING AVG(salary) > 75000         -- Step 3: filter groups
ORDER BY avg_salary DESC;          -- Step 4: sort

Examples

-- WHERE: filter before grouping (only active employees)
SELECT department, COUNT(*) AS emp_count
FROM employees
WHERE status = 'Active'
GROUP BY department;

-- HAVING: filter after grouping (departments with more than 5 people)
SELECT department, COUNT(*) AS emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

-- Combined: active employees, grouped, then filter high-paying departments
SELECT
    department,
    COUNT(*) AS emp_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE status = 'Active'
GROUP BY department
HAVING AVG(salary) > 75000
ORDER BY avg_salary DESC;

You cannot use aggregate functions in a WHERE clause. WHERE COUNT(*) > 5 is invalid β€” use HAVING instead. Conversely, you cannot use column aliases in HAVING β€” repeat the full expression.

GROUP BY with NULLs

NULL values form their own group in GROUP BY.

-- NULLs are grouped together
SELECT
    department_id,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department_id
ORDER BY department_id;

-- Replace NULLs with a default value for cleaner output
SELECT
    COALESCE(department_id, 0) AS department_id,
    COUNT(*) AS employee_count
FROM employees
GROUP BY COALESCE(department_id, 0);

GROUP BY with ROLLUP and CUBE

-- ROLLUP: adds subtotals and a grand total
SELECT
    department,
    job_title,
    COUNT(*) AS employee_count
FROM employees
GROUP BY ROLLUP (department, job_title);

-- CUBE: all possible subtotals (not supported in MySQL)
SELECT
    department,
    job_title,
    COUNT(*) AS employee_count
FROM employees
GROUP BY CUBE (department, job_title);
ExtensionSubtotals Generated
GROUP BYIndividual groups only
ROLLUPHierarchical subtotals + grand total
CUBEAll possible subtotal combinations + grand total

GROUP BY with JOINs

-- Aggregate order data joined with customer info
SELECT
    c.first_name,
    c.last_name,
    COUNT(o.order_id) AS order_count,
    SUM(o.total_amount) AS total_spent
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
HAVING COUNT(o.order_id) > 5
ORDER BY total_spent DESC;

Performance Considerations

FactorRecommendation
IndexesCreate indexes on GROUP BY columns
SortingGROUP BY may require implicit sorting
MemoryLarge groups can exceed memory limits
StatisticsKeep table statistics current
Execution planUse EXPLAIN to verify index usage
-- Index recommendations for common GROUP BY patterns
CREATE INDEX idx_employees_department ON employees(department);
CREATE INDEX idx_employees_dept_job ON employees(department, job_title);
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_orders_product_date ON orders(product_id, order_date);

Practice Exercises

Exercise 1: Count the number of employees in each department.

-- Solution
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
ORDER BY employee_count DESC;

Exercise 2: Find the average salary per department for departments with more than 10 employees.

-- Solution
SELECT
    department,
    COUNT(*) AS employee_count,
    ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
HAVING COUNT(*) > 10
ORDER BY avg_salary DESC;

Exercise 3: Generate a monthly sales report for 2024 with order count, total revenue, and average order value.

-- Solution
SELECT
    MONTH(order_date) AS month_number,
    MONTHNAME(order_date) AS month_name,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_revenue,
    ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
WHERE YEAR(order_date) = 2024
GROUP BY MONTH(order_date), MONTHNAME(order_date)
ORDER BY month_number;

Key Takeaways

  1. GROUP BY collapses rows into groups based on column values.
  2. Aggregate functions (COUNT, SUM, AVG, MIN, MAX) operate on each group.
  3. WHERE filters individual rows before grouping; HAVING filters groups after aggregation.
  4. Any non-aggregated column in SELECT must appear in GROUP BY.
  5. NULL values form their own group automatically.
  6. Use ROLLUP or CUBE for subtotals and grand totals.
  7. Always index GROUP BY columns for optimal performance.

See Also

⭐

Premium Content

GROUP BY Clause

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert SQL Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement