SQL Expert

Table Partitioning

Divide massive tables into smaller, manageable partitions for faster queries and easier maintenance.

Range Partitioning — split data by date ranges or numeric intervals
List Partitioning — assign rows to partitions based on discrete values
Hash Partitioning — distribute data evenly across partitions using a hash function
Partition Pruning — let the engine skip irrelevant partitions at query time Scale to billions of rows without rewriting application queries.

What Is Table Partitioning?

DfTable Partitioning

A database design technique that splits one logical table into multiple physical storage units called partitions. Each partition holds a subset of the data and can be managed independently. The database engine transparently routes queries to the correct partitions through a process called partition pruning.

-- Range partitioning by year (PostgreSQL)
CREATE TABLE sales (
    id          BIGSERIAL,
    sale_date   DATE NOT NULL,
    customer_id INT NOT NULL,
    amount      DECIMAL(10,2),
    region      VARCHAR(50)
) PARTITION BY RANGE (sale_date);

CREATE TABLE sales_2022 PARTITION OF sales
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

CREATE TABLE sales_2023 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE sales_2024 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

Partition pruning happens automatically when your WHERE clause filters on the partition key. A query with WHERE sale_date = '2024-06-15' will only scan the sales_2024 partition.

Types of Partitioning

Strategy	How It Works	Best For
Range	Splits by continuous values (dates, numbers)	Time-series data, log tables
List	Assigns rows to partitions by explicit value lists	Multi-tenant systems, regional data
Hash	Distributes rows by hashing a column	Even distribution, no natural range key
Composite	Combines two or more strategies	Large tables needing both range and list

Range Partitioning — Deep Dive

DfRange Partitioning

Maps each row to a partition based on whether the partition key falls within a defined range. Each partition has lower and upper bounds. Rows outside all defined ranges are rejected or routed to a default partition.

-- Monthly range partitioning (PostgreSQL)
CREATE TABLE events (
    event_id    BIGSERIAL,
    event_time  TIMESTAMP NOT NULL,
    user_id     INT NOT NULL,
    event_type  VARCHAR(50),
    payload     JSONB
) PARTITION BY RANGE (event_time);

-- Create partitions for each month
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
CREATE TABLE events_2024_03 PARTITION OF events
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
-- ... repeat for each month

-- Default partition catches anything that doesn't match
CREATE TABLE events_default PARTITION OF events DEFAULT;

-- MySQL range partitioning syntax
CREATE TABLE metrics (
    id          BIGINT AUTO_INCREMENT,
    recorded_at DATETIME NOT NULL,
    metric_name VARCHAR(100),
    metric_value DOUBLE,
    PRIMARY KEY (id, recorded_at)
)
PARTITION BY RANGE (YEAR(recorded_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

In MySQL, the partition key must be part of the primary key. If you don't include it, you'll get: InnoDB does not support partitioning in place of the PRIMARY KEY.

List Partitioning

DfList Partitioning

Assigns rows to partitions based on a discrete set of values. Each partition is defined by an explicit list of allowed values for the partition key.

-- List partitioning by region
CREATE TABLE customer_data (
    id            BIGSERIAL,
    customer_name VARCHAR(100),
    region        VARCHAR(20) NOT NULL,
    revenue       DECIMAL(12,2)
) PARTITION BY LIST (region);

CREATE TABLE customer_na PARTITION OF customer_data
    FOR VALUES IN ('US', 'CA', 'MX');
CREATE TABLE customer_eu PARTITION OF customer_data
    FOR VALUES IN ('UK', 'DE', 'FR', 'ES', 'IT');
CREATE TABLE customer_apac PARTITION OF customer_data
    FOR VALUES IN ('JP', 'AU', 'KR', 'IN', 'SG');

-- Multi-column list partitioning (PostgreSQL)
CREATE TABLE orders (
    id          BIGSERIAL,
    order_date  DATE NOT NULL,
    status      VARCHAR(20) NOT NULL,
    warehouse   VARCHAR(10) NOT NULL,
    total       DECIMAL(10,2)
) PARTITION BY LIST (status);

CREATE TABLE orders_pending PARTITION OF orders
    FOR VALUES IN ('PENDING', 'PROCESSING');
CREATE TABLE orders_shipped PARTITION OF orders
    FOR VALUES IN ('SHIPPED', 'DELIVERED');
CREATE TABLE orders_cancelled PARTITION OF orders
    FOR VALUES IN ('CANCELLED', 'REFUNDED');

Hash Partitioning

DfHash Partitioning

Distributes rows across a fixed number of partitions by applying a hash function to the partition key. This ensures roughly even distribution regardless of the data's natural distribution.

-- Hash partitioning into 8 partitions (PostgreSQL)
CREATE TABLE sessions (
    session_id  UUID NOT NULL,
    user_id     INT NOT NULL,
    started_at  TIMESTAMP,
    last_active TIMESTAMP,
    ip_address  INET
) PARTITION BY HASH (session_id);

CREATE TABLE sessions_0 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE sessions_1 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
CREATE TABLE sessions_2 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 2);
CREATE TABLE sessions_3 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 3);
CREATE TABLE sessions_4 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 4);
CREATE TABLE sessions_5 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 5);
CREATE TABLE sessions_6 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 6);
CREATE TABLE sessions_7 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 7);

Partition Pruning

DfPartition Pruning

The optimizer's ability to eliminate partitions that cannot contain matching rows, based on conditions in the WHERE clause. This reduces I/O by scanning only relevant partitions.

-- This query prunes to only sales_2024
EXPLAIN ANALYZE
SELECT region, SUM(amount) AS total_sales
FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY region;

-- This query prunes across two partitions
EXPLAIN ANALYZE
SELECT *
FROM sales
WHERE sale_date >= '2023-06-01' AND sale_date < '2024-06-01';

-- This query CANNOT prune (function on partition key)
EXPLAIN ANALYZE
SELECT * FROM sales
WHERE EXTRACT(YEAR FROM sale_date) = 2024;  -- scans all partitions!

For best pruning performance, always filter directly on the partition column. Avoid wrapping it in functions like EXTRACT(), DATE_TRUNC(), or CAST() in the WHERE clause.

Managing Partitions

-- Add a new partition for the next month
CREATE TABLE sales_2025_01 PARTITION OF sales
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

-- Detach an old partition (PostgreSQL 12+)
ALTER TABLE sales DETACH PARTITION sales_2022;

-- Drop a detached partition
DROP TABLE sales_2022;

-- Attach an existing table as a partition
ALTER TABLE sales ATTACH PARTITION sales_2022
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

-- Check partition sizes
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS size
FROM pg_tables
WHERE tablename LIKE 'sales_%'
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;

-- MySQL: Add and drop partitions
ALTER TABLE metrics ADD PARTITION (
    PARTITION p2025 VALUES LESS THAN (2026)
);

ALTER TABLE metrics DROP PARTITION p2022;

-- Reorganize a partition (MySQL)
ALTER TABLE metrics REORGANIZE PARTITION p2024 INTO (
    PARTITION p2024_h1 VALUES LESS THAN (2024),
    PARTITION p2024_h2 VALUES LESS THAN (2025)
);

Partitioning vs Indexing

Feature	Partitioning	Indexing
Purpose	Split table into physical pieces	Create sorted lookup structures
Granularity	Row groups (thousands to millions)	Individual rows
Pruning	Eliminates entire partitions	Seeks into B-tree or hash index
Maintenance	Drop/detach partitions for archival	Rebuild indexes periodically
Best For	Large time-series, multi-tenant	Frequent lookups on specific columns
Combined	Partition + index each partition	Best of both worlds

You can (and should) create indexes within each partition. Most databases automatically create partition-local indexes when you define them on the parent table.

When to Partition

Scenario	Partition?	Why
Table over 100GB	Yes	Pruning eliminates massive I/O
Time-series with archival	Yes	Drop old partitions instantly
Multi-tenant with data isolation	Yes	Per-tenant partition management
Table under 10GB	No	Overhead exceeds benefits
Frequent cross-partition joins	Caution	May hurt performance
Small lookup tables	No	Indexes are sufficient

Common Pitfalls

-- WRONG: Using a function on the partition key (prevents pruning)
SELECT * FROM sales WHERE YEAR(sale_date) = 2024;

-- RIGHT: Filter on the raw partition key
SELECT * FROM sales
WHERE sale_date >= '2024-01-01' AND sale_date < '2025-01-01';

-- WRONG: Forgetting the default partition
-- Rows outside defined ranges will be rejected
INSERT INTO sales (sale_date, customer_id, amount, region)
VALUES ('2026-01-15', 1, 99.99, 'US');
-- ERROR: no partition of relation "sales" found for row

-- RIGHT: Always include a default partition
CREATE TABLE sales_default PARTITION OF sales DEFAULT;

Real-World: Automated Partition Management

-- Procedure to auto-create next month's partition (PostgreSQL)
CREATE OR REPLACE PROCEDURE create_next_partition(
    p_table_name TEXT,
    p_key_column TEXT
)
LANGUAGE plpgsql
AS $$
DECLARE
    v_next_start DATE := DATE_TRUNC('month', CURRENT_DATE + INTERVAL '1 month');
    v_next_end   DATE := v_next_start + INTERVAL '1 month';
    v_part_name  TEXT := p_table_name || '_' || TO_CHAR(v_next_start, 'YYYY_MM');
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
        v_part_name, p_table_name, v_next_start, v_next_end
    );
    RAISE NOTICE 'Created partition: %', v_part_name;
END;
$$;

-- Call monthly via pg_cron
SELECT cron.schedule('create-sales-partition', '0 0 1 * *',
    $$CALL create_next_partition('sales', 'sale_date')$$
);

Summary

Concept	Key Takeaway
Range	Best for time-series; partitions by value intervals
List	Best for discrete categories; partitions by explicit values
Hash	Best for even distribution; no natural range key needed
Pruning	Filter on partition key to skip irrelevant partitions
Management	Add, detach, and drop partitions independently
Indexing	Always index within partitions for best performance

Key Takeaways

Partition large tables by date or category to enable partition pruning and faster queries
Use range partitioning for time-series data and list partitioning for multi-tenant systems
Always filter directly on the partition key — functions prevent pruning
Create indexes on each partition for the best combination of pruning and seeking
Automate partition creation and archival for production systems

Table Partitioning

Table Partitioning

What Is Table Partitioning?

DfTable Partitioning

Types of Partitioning

Range Partitioning — Deep Dive

DfRange Partitioning

List Partitioning

DfList Partitioning

Hash Partitioning

DfHash Partitioning

Partition Pruning

DfPartition Pruning

Managing Partitions

Partitioning vs Indexing

When to Partition

Common Pitfalls

Real-World: Automated Partition Management

Summary

Key Takeaways

Premium Content

Need Expert SQL Help?