πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Table Partitioning

SQL ExpertPerformance & Scalability🟒 Free Lesson

Advertisement

SQL Expert

Table Partitioning

Divide massive tables into smaller, manageable partitions for faster queries and easier maintenance.

  • Range Partitioning β€” split data by date ranges or numeric intervals
  • List Partitioning β€” assign rows to partitions based on discrete values
  • Hash Partitioning β€” distribute data evenly across partitions using a hash function
  • Partition Pruning β€” let the engine skip irrelevant partitions at query time Scale to billions of rows without rewriting application queries.

What Is Table Partitioning?

DfTable Partitioning

A database design technique that splits one logical table into multiple physical storage units called partitions. Each partition holds a subset of the data and can be managed independently. The database engine transparently routes queries to the correct partitions through a process called partition pruning.

-- Range partitioning by year (PostgreSQL)
CREATE TABLE sales (
    id          BIGSERIAL,
    sale_date   DATE NOT NULL,
    customer_id INT NOT NULL,
    amount      DECIMAL(10,2),
    region      VARCHAR(50)
) PARTITION BY RANGE (sale_date);

CREATE TABLE sales_2022 PARTITION OF sales
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

CREATE TABLE sales_2023 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE sales_2024 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

Partition pruning happens automatically when your WHERE clause filters on the partition key. A query with WHERE sale_date = '2024-06-15' will only scan the sales_2024 partition.

Types of Partitioning

StrategyHow It WorksBest For
RangeSplits by continuous values (dates, numbers)Time-series data, log tables
ListAssigns rows to partitions by explicit value listsMulti-tenant systems, regional data
HashDistributes rows by hashing a columnEven distribution, no natural range key
CompositeCombines two or more strategiesLarge tables needing both range and list

Range Partitioning β€” Deep Dive

DfRange Partitioning

Maps each row to a partition based on whether the partition key falls within a defined range. Each partition has lower and upper bounds. Rows outside all defined ranges are rejected or routed to a default partition.

-- Monthly range partitioning (PostgreSQL)
CREATE TABLE events (
    event_id    BIGSERIAL,
    event_time  TIMESTAMP NOT NULL,
    user_id     INT NOT NULL,
    event_type  VARCHAR(50),
    payload     JSONB
) PARTITION BY RANGE (event_time);

-- Create partitions for each month
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
CREATE TABLE events_2024_03 PARTITION OF events
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
-- ... repeat for each month

-- Default partition catches anything that doesn't match
CREATE TABLE events_default PARTITION OF events DEFAULT;
-- MySQL range partitioning syntax
CREATE TABLE metrics (
    id          BIGINT AUTO_INCREMENT,
    recorded_at DATETIME NOT NULL,
    metric_name VARCHAR(100),
    metric_value DOUBLE,
    PRIMARY KEY (id, recorded_at)
)
PARTITION BY RANGE (YEAR(recorded_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

In MySQL, the partition key must be part of the primary key. If you don't include it, you'll get: InnoDB does not support partitioning in place of the PRIMARY KEY.

List Partitioning

DfList Partitioning

Assigns rows to partitions based on a discrete set of values. Each partition is defined by an explicit list of allowed values for the partition key.

-- List partitioning by region
CREATE TABLE customer_data (
    id            BIGSERIAL,
    customer_name VARCHAR(100),
    region        VARCHAR(20) NOT NULL,
    revenue       DECIMAL(12,2)
) PARTITION BY LIST (region);

CREATE TABLE customer_na PARTITION OF customer_data
    FOR VALUES IN ('US', 'CA', 'MX');
CREATE TABLE customer_eu PARTITION OF customer_data
    FOR VALUES IN ('UK', 'DE', 'FR', 'ES', 'IT');
CREATE TABLE customer_apac PARTITION OF customer_data
    FOR VALUES IN ('JP', 'AU', 'KR', 'IN', 'SG');
-- Multi-column list partitioning (PostgreSQL)
CREATE TABLE orders (
    id          BIGSERIAL,
    order_date  DATE NOT NULL,
    status      VARCHAR(20) NOT NULL,
    warehouse   VARCHAR(10) NOT NULL,
    total       DECIMAL(10,2)
) PARTITION BY LIST (status);

CREATE TABLE orders_pending PARTITION OF orders
    FOR VALUES IN ('PENDING', 'PROCESSING');
CREATE TABLE orders_shipped PARTITION OF orders
    FOR VALUES IN ('SHIPPED', 'DELIVERED');
CREATE TABLE orders_cancelled PARTITION OF orders
    FOR VALUES IN ('CANCELLED', 'REFUNDED');

Hash Partitioning

DfHash Partitioning

Distributes rows across a fixed number of partitions by applying a hash function to the partition key. This ensures roughly even distribution regardless of the data's natural distribution.

-- Hash partitioning into 8 partitions (PostgreSQL)
CREATE TABLE sessions (
    session_id  UUID NOT NULL,
    user_id     INT NOT NULL,
    started_at  TIMESTAMP,
    last_active TIMESTAMP,
    ip_address  INET
) PARTITION BY HASH (session_id);

CREATE TABLE sessions_0 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE sessions_1 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
CREATE TABLE sessions_2 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 2);
CREATE TABLE sessions_3 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 3);
CREATE TABLE sessions_4 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 4);
CREATE TABLE sessions_5 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 5);
CREATE TABLE sessions_6 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 6);
CREATE TABLE sessions_7 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 7);

Partition Pruning

DfPartition Pruning

The optimizer's ability to eliminate partitions that cannot contain matching rows, based on conditions in the WHERE clause. This reduces I/O by scanning only relevant partitions.

-- This query prunes to only sales_2024
EXPLAIN ANALYZE
SELECT region, SUM(amount) AS total_sales
FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY region;

-- This query prunes across two partitions
EXPLAIN ANALYZE
SELECT *
FROM sales
WHERE sale_date >= '2023-06-01' AND sale_date < '2024-06-01';

-- This query CANNOT prune (function on partition key)
EXPLAIN ANALYZE
SELECT * FROM sales
WHERE EXTRACT(YEAR FROM sale_date) = 2024;  -- scans all partitions!

For best pruning performance, always filter directly on the partition column. Avoid wrapping it in functions like EXTRACT(), DATE_TRUNC(), or CAST() in the WHERE clause.

Managing Partitions

-- Add a new partition for the next month
CREATE TABLE sales_2025_01 PARTITION OF sales
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

-- Detach an old partition (PostgreSQL 12+)
ALTER TABLE sales DETACH PARTITION sales_2022;

-- Drop a detached partition
DROP TABLE sales_2022;

-- Attach an existing table as a partition
ALTER TABLE sales ATTACH PARTITION sales_2022
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

-- Check partition sizes
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS size
FROM pg_tables
WHERE tablename LIKE 'sales_%'
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;
-- MySQL: Add and drop partitions
ALTER TABLE metrics ADD PARTITION (
    PARTITION p2025 VALUES LESS THAN (2026)
);

ALTER TABLE metrics DROP PARTITION p2022;

-- Reorganize a partition (MySQL)
ALTER TABLE metrics REORGANIZE PARTITION p2024 INTO (
    PARTITION p2024_h1 VALUES LESS THAN (2024),
    PARTITION p2024_h2 VALUES LESS THAN (2025)
);

Partitioning vs Indexing

FeaturePartitioningIndexing
PurposeSplit table into physical piecesCreate sorted lookup structures
GranularityRow groups (thousands to millions)Individual rows
PruningEliminates entire partitionsSeeks into B-tree or hash index
MaintenanceDrop/detach partitions for archivalRebuild indexes periodically
Best ForLarge time-series, multi-tenantFrequent lookups on specific columns
CombinedPartition + index each partitionBest of both worlds

You can (and should) create indexes within each partition. Most databases automatically create partition-local indexes when you define them on the parent table.

When to Partition

ScenarioPartition?Why
Table over 100GBYesPruning eliminates massive I/O
Time-series with archivalYesDrop old partitions instantly
Multi-tenant with data isolationYesPer-tenant partition management
Table under 10GBNoOverhead exceeds benefits
Frequent cross-partition joinsCautionMay hurt performance
Small lookup tablesNoIndexes are sufficient

Common Pitfalls

-- WRONG: Using a function on the partition key (prevents pruning)
SELECT * FROM sales WHERE YEAR(sale_date) = 2024;

-- RIGHT: Filter on the raw partition key
SELECT * FROM sales
WHERE sale_date >= '2024-01-01' AND sale_date < '2025-01-01';
-- WRONG: Forgetting the default partition
-- Rows outside defined ranges will be rejected
INSERT INTO sales (sale_date, customer_id, amount, region)
VALUES ('2026-01-15', 1, 99.99, 'US');
-- ERROR: no partition of relation "sales" found for row

-- RIGHT: Always include a default partition
CREATE TABLE sales_default PARTITION OF sales DEFAULT;

Real-World: Automated Partition Management

-- Procedure to auto-create next month's partition (PostgreSQL)
CREATE OR REPLACE PROCEDURE create_next_partition(
    p_table_name TEXT,
    p_key_column TEXT
)
LANGUAGE plpgsql
AS $$
DECLARE
    v_next_start DATE := DATE_TRUNC('month', CURRENT_DATE + INTERVAL '1 month');
    v_next_end   DATE := v_next_start + INTERVAL '1 month';
    v_part_name  TEXT := p_table_name || '_' || TO_CHAR(v_next_start, 'YYYY_MM');
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
        v_part_name, p_table_name, v_next_start, v_next_end
    );
    RAISE NOTICE 'Created partition: %', v_part_name;
END;
$$;

-- Call monthly via pg_cron
SELECT cron.schedule('create-sales-partition', '0 0 1 * *',
    $$CALL create_next_partition('sales', 'sale_date')$$
);

Summary

ConceptKey Takeaway
RangeBest for time-series; partitions by value intervals
ListBest for discrete categories; partitions by explicit values
HashBest for even distribution; no natural range key needed
PruningFilter on partition key to skip irrelevant partitions
ManagementAdd, detach, and drop partitions independently
IndexingAlways index within partitions for best performance

Key Takeaways

  1. Partition large tables by date or category to enable partition pruning and faster queries
  2. Use range partitioning for time-series data and list partitioning for multi-tenant systems
  3. Always filter directly on the partition key β€” functions prevent pruning
  4. Create indexes on each partition for the best combination of pruning and seeking
  5. Automate partition creation and archival for production systems
⭐

Premium Content

Table Partitioning

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert SQL Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement