Speed up your slow queries with execution plans, indexing strategies, and proven optimization techniques.
Execution Plans β Understand exactly how the database processes each query
Indexing Strategies β Create the right indexes for your workload
Query Rewriting β Transform slow queries into fast ones with simple techniques
Performance is not an afterthought β it's a design decision.
Why Query Performance Matters
DfQuery Performance
Query performance measures how quickly a database retrieves or modifies data. It's influenced by schema design, indexing, query structure, data volume, and hardware resources. Poor performance leads to slow applications, frustrated users, and increased infrastructure costs.
A query that takes 100ms with 1,000 rows might take 10+ seconds with 1,000,000 rows if not properly optimized. Performance tuning is about making queries scale.
Performance Factor
Impact
Difficulty to Fix
Missing indexes
Very High
Low
Poor query structure
High
Medium
Bad schema design
High
High
Insufficient hardware
Medium
Low (just upgrade)
Lock contention
High
High
Outdated statistics
Medium
Low
The Performance Tuning Process
-- Step 1: Identify slow queries
-- PostgreSQL: Enable slow query logging
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1 second
SELECT pg_reload_conf();
-- Step 2: Measure baseline performance
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT c.name, COUNT(o.order_id) AS order_count
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.id, c.name
ORDER BY order_count DESC
LIMIT 10;
-- Step 3: Create appropriate index
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
-- Step 4: Re-measure and compare
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT c.name, COUNT(o.order_id) AS order_count
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.id, c.name
ORDER BY order_count DESC
LIMIT 10;
Always measure before and after optimization. Without baseline numbers, you can't prove your changes actually improved performance.
Indexing Strategies
How Indexes Work
DfIndex
An index is a data structure that provides fast lookup of rows matching specific column values, similar to a book's index. Without an index, the database must scan every row (full table scan) to find matching records.
-- Without index: Full table scan (checks every row)
-- With index: Index scan (follows B-tree structure)
-- B-tree index (default, most common)
CREATE INDEX idx_customers_email ON customers(email);
SELECT * FROM customers WHERE email = 'alice@example.com';
-- Speed: O(log n) instead of O(n)
-- Composite index (multiple columns)
CREATE INDEX idx_orders_status_date ON orders(status, order_date);
SELECT * FROM orders WHERE status = 'shipped' AND order_date > '2024-01-01';
-- Uses both columns in the index
-- Partial index (index only subset of rows)
CREATE INDEX idx_orders_pending ON orders(order_date)
WHERE status = 'pending';
-- Much smaller index, faster for pending order queries
-- Covering index (includes all needed columns)
CREATE INDEX idx_orders_covering ON orders(customer_id, order_date)
INCLUDE (total, status);
-- Query answered entirely from index, no table access needed
Index Types Comparison
Index Type
Best For
PostgreSQL Syntax
B-tree
Equality, range queries
CREATE INDEX ... ON table(col)
Hash
Equality only (fastest)
CREATE INDEX ... ON table USING hash(col)
GIN
Full-text search, arrays
CREATE INDEX ... ON table USING gin(col)
GiST
Geometric,ε ¨ζζη΄’
CREATE INDEX ... ON table USING gist(col)
BRIN
Large, naturally ordered tables
CREATE INDEX ... ON table USING brin(col)
Partial
Filtered queries
CREATE INDEX ... ON table(col) WHERE condition
When NOT to Create Indexes
Indexes are not free. Every index slows down INSERT, UPDATE, and DELETE operations because the index must be maintained. Don't index columns that are rarely queried.
Skip Indexing On
Reason
Small tables (< 10K rows)
Full scan is fast enough
Low-cardinality columns (status, boolean)
Index not selective enough
Columns rarely used in WHERE/JOIN
Maintenance cost outweighs benefit
Frequently updated columns
Index maintenance slows writes
Wide text columns without prefix
Index size becomes unwieldy
Common Query Anti-Patterns
Anti-Pattern 1: SELECT *
-- BAD: Fetches all columns, even those not needed
SELECT * FROM customers
INNER JOIN orders ON customers.id = orders.customer_id
WHERE orders.total > 100;
-- GOOD: Only fetch what you need
SELECT
c.name,
c.email,
o.order_date,
o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.total > 100;
Anti-Pattern 2: Functions on Indexed Columns
-- BAD: Function prevents index usage (full table scan)
SELECT * FROM orders
WHERE YEAR(order_date) = 2024;
-- GOOD: Rewrite to allow index usage
SELECT * FROM orders
WHERE order_date >= '2024-01-01'
AND order_date < '2025-01-01';
-- BAD: Lowercase comparison prevents index usage
SELECT * FROM customers
WHERE LOWER(email) = 'alice@example.com';
-- GOOD: Create functional index
CREATE INDEX idx_customers_email_lower ON customers(LOWER(email));
SELECT * FROM customers
WHERE LOWER(email) = 'alice@example.com';
Anti-Pattern 3: NOT IN with Subquery
-- BAD: NOT IN with subquery (slow, NULL-handling issues)
SELECT * FROM products
WHERE product_id NOT IN (SELECT product_id FROM order_items);
-- GOOD: Use NOT EXISTS (better performance)
SELECT * FROM products p
WHERE NOT EXISTS (
SELECT 1 FROM order_items oi WHERE oi.product_id = p.product_id
);
-- GOOD: Use LEFT JOIN with NULL check
SELECT p.*
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE oi.product_id IS NULL;
Anti-Pattern 4: Implicit JOIN
-- BAD: Implicit join (old syntax, hard to read)
SELECT c.name, o.total
FROM customers c, orders o
WHERE c.id = o.customer_id
AND o.total > 100;
-- GOOD: Explicit JOIN (clear intent)
SELECT c.name, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.total > 100;
JOIN Optimization
DfJOIN Performance
JOIN performance depends on the join algorithm used: Nested Loop (small tables), Hash Join (medium tables), or Merge Join (pre-sorted data). The database optimizer chooses the algorithm, but you can influence it through indexes and statistics.
-- Ensure JOIN columns are indexed
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_order_items_order ON order_items(order_id);
-- Use appropriate JOIN type
-- BAD: CROSS JOIN then filter (produces all combinations first)
SELECT c.name, p.name
FROM customers c, products p, orders o
WHERE c.id = o.customer_id AND o.product_id = p.product_id;
-- GOOD: Explicit INNER JOIN
SELECT c.name, p.name
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN products p ON o.product_id = p.product_id;
-- Filter early to reduce JOIN dataset
SELECT c.name, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01' -- Filter before or after?
-- Filter in subquery for complex scenarios
SELECT c.name, agg.total_spent
FROM customers c
INNER JOIN (
SELECT customer_id, SUM(total) AS total_spent
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY customer_id
) agg ON c.id = agg.customer_id;
For large JOINs, consider materialized views or temporary tables to pre-compute intermediate results. This trades storage for speed.
Subquery vs JOIN Performance
Approach
When It's Faster
When It's Slower
JOIN
Need data from both tables
Duplicate rows from 1:N joins
EXISTS
Just checking existence
When full data needed
IN
Small subquery result set
Large subquery result set
LATERAL JOIN
Per-row subquery computation
Correlated subqueries
CTE
Readability, complex logic
Simple queries (CTE overhead)
-- Subquery vs JOIN: Find customers who placed orders in 2024
-- Method 1: Subquery with IN (good for small result sets)
SELECT * FROM customers
WHERE customer_id IN (
SELECT customer_id FROM orders
WHERE order_date >= '2024-01-01'
);
-- Method 2: JOIN (good when you need order data too)
SELECT DISTINCT c.*
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01';
-- Method 3: EXISTS (best for existence check)
SELECT * FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_id = c.customer_id
AND o.order_date >= '2024-01-01'
);
Pagination Performance
-- BAD: OFFSET-based pagination (slow for large offsets)
SELECT * FROM products
ORDER BY product_id
LIMIT 20 OFFSET 10000; -- Database reads 10020 rows, discards 10000
-- GOOD: Keyset pagination (constant time)
SELECT * FROM products
WHERE product_id > 10000 -- Last seen ID from previous page
ORDER BY product_id
LIMIT 20;
-- BAD: Using LIMIT without ORDER BY (non-deterministic)
SELECT * FROM orders LIMIT 10;
-- GOOD: Deterministic ordering
SELECT * FROM orders
ORDER BY order_id
LIMIT 10;
OFFSET-based pagination becomes exponentially slower as the offset increases. For tables with millions of rows, always use keyset pagination.
Performance Benchmarks
-- Benchmarking with pgbench-style queries
-- Test 1: Simple primary key lookup (should be < 1ms)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM customers WHERE customer_id = 12345;
-- Test 2: Indexed range query (should be < 10ms)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';
-- Test 3: Aggregation with GROUP BY (should be < 100ms)
EXPLAIN (ANALYZE, BUFFERS)
SELECT
DATE_TRUNC('month', order_date) AS month,
COUNT(*) AS order_count,
SUM(total) AS revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;
-- Test 4: Complex JOIN (should be < 500ms)
EXPLAIN (ANALYZE, BUFFERS)
SELECT
c.name,
c.email,
COUNT(o.order_id) AS total_orders,
SUM(o.total) AS lifetime_value
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.created_at >= '2023-01-01'
GROUP BY c.customer_id, c.name, c.email
HAVING COUNT(o.order_id) > 5
ORDER BY lifetime_value DESC
LIMIT 20;
Query Type
Target Time
Optimization Priority
PK lookup
< 1ms
Index exists?
Indexed range
< 10ms
Composite index?
Aggregation
< 100ms
GROUP BY indexed?
Complex JOIN
< 500ms
Join indexes?
Full table scan
Avoid
Add index
Set performance budgets for your critical queries. If a query exceeds its budget, it becomes a bug that must be fixed before deployment.
Key Takeaways
Measure before optimizing β Use EXPLAIN ANALYZE to get actual execution times
Index strategically β Index WHERE, JOIN, and ORDER BY columns, not everything
**Avoid SELECT *** β Fetch only the columns you need
Rewrite anti-patterns β Functions on columns, NOT IN, OFFSET pagination
Test with realistic data β 100 rows and 10M rows behave very differently
Performance Quick Reference
Problem
Symptom
Solution
Missing index
Sequential scan on large table
CREATE INDEX on filtered/joined columns
Wrong index type
Index not used
Check column usage patterns
Lock timeout
Queries blocking each other
Reduce transaction duration
Memory pressure
Disk-based sorts/hashes
Increase work_mem, add indexes
Statistics stale
Bad query plans
ANALYZE table
N+1 queries
Many small queries in loop
Use JOIN or batch loading
β
Premium Content
Query Performance
Unlock this lesson and 900+ advanced tutorials with a Premium plan.