SQL Indexes
Speed up your queries β the right index can turn a 30-second query into a millisecond lookup.
- B-Tree Index β The default, ideal for equality and range queries
- Composite Index β Covers multiple columns for multi-condition lookups
- Unique Index β Enforces uniqueness while providing fast access An index is like a book's table of contents β it tells the database where to look without scanning every page.
What Is an Index?
DfIndex
An index is a data structure that provides fast lookups for specific columns. It creates a sorted copy of the indexed column(s) with pointers back to the original rows, allowing the database to find matching rows without scanning the entire table.
-- Create a basic index on a single column
CREATE INDEX idx_employees_last_name ON employees(last_name);
-- Query that benefits from the index
SELECT * FROM employees WHERE last_name = 'Smith';
How Indexes Work
DfB-Tree Structure
Most database indexes use a B-Tree (balanced tree) structure. Data is organized in a tree where each leaf node contains the indexed values and a pointer to the row. The database can traverse from root to leaf in O(log n) time instead of scanning every row.
Without an index, the database performs a full table scan β reading every row to find matches. With an index, it traverses the tree directly to the matching rows.
| Operation | Without Index | With Index |
|---|---|---|
| Find one row by ID | O(n) β scan all rows | O(log n) β tree traversal |
| Range query (BETWEEN) | O(n) β scan all rows | O(log n) β find range start, then scan leaves |
| COUNT with WHERE | O(n) β scan all rows | O(log n) if index covers the filter |
| INSERT | O(1) β append row | O(log n) β update the tree |
| UPDATE indexed column | O(1) β update row | O(log n) β rebuild tree node |
Types of Indexes
Single-Column Index
CREATE INDEX idx_employees_email ON employees(email);
Composite Index (Multi-Column)
-- Order matters: leftmost prefix rule
CREATE INDEX idx_employees_dept_salary ON employees(department_id, salary);
| Query | Uses Index? | Why |
|---|---|---|
WHERE department_id = 5 | β Yes | Matches leftmost column |
WHERE department_id = 5 AND salary > 70000 | β Yes | Matches both columns |
WHERE salary > 70000 | β No | Skips leftmost column |
WHERE department_id = 5 OR salary > 70000 | β οΈ Partial | OR may prevent index use |
Unique Index
-- Enforces uniqueness and provides fast lookups
CREATE UNIQUE INDEX idx_employees_email ON employees(email);
Partial Index (PostgreSQL)
-- Index only active employees
CREATE INDEX idx_active_employees ON employees(last_name)
WHERE status = 'Active';
Covering Index
-- Include columns needed by the query to avoid table lookups
CREATE INDEX idx_emp_covering ON employees(department_id, salary, first_name, last_name);
-- This query is fully served by the index β no table access needed
SELECT first_name, last_name
FROM employees
WHERE department_id = 5 AND salary > 70000;
Expression Index
-- Index on a function or expression
CREATE INDEX idx_emp_lower_email ON employees(LOWER(email));
-- Uses the expression index
SELECT * FROM employees WHERE LOWER(email) = 'john@example.com';
Creating and Managing Indexes
-- Create index
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Create index only if it doesn't exist (PostgreSQL)
CREATE INDEX IF NOT EXISTS idx_orders_customer ON orders(customer_id);
-- Drop an index
DROP INDEX idx_orders_customer;
-- Rename an index (PostgreSQL)
ALTER INDEX idx_orders_customer RENAME TO idx_cust_orders;
-- List all indexes on a table (PostgreSQL)
SELECT
indexname,
indexdef
FROM pg_indexes
WHERE tablename = 'employees';
-- List indexes (MySQL)
SHOW INDEX FROM employees;
When to Create an Index
The rule of thumb: create an index on columns that appear in WHERE, JOIN, and ORDER BY clauses β but only for tables with significant row counts and frequent query patterns.
| Scenario | Create Index? | Why |
|---|---|---|
| Column in WHERE clause (equality) | β Yes | Fast lookup for exact matches |
| Column in WHERE clause (range) | β Yes | B-Tree supports range scans efficiently |
| Column in JOIN (foreign key) | β Yes | Speeds up join operations |
| Column in ORDER BY | β Yes | Avoids costly sort operations |
| Low-cardinality column (e.g., gender) | β οΈ Maybe | Index may not help if it returns >20% of rows |
| Small table (<1000 rows) | β Usually no | Full scan is fast enough |
| Column rarely used in queries | β No | Wasted space and write overhead |
Index Impact on Performance
DfIndex Overhead
Indexes speed up reads but slow down writes. Every INSERT, UPDATE, and DELETE on an indexed column must also update the index structure. Over-indexing degrades write performance and consumes disk space.
| Operation | Impact of Index |
|---|---|
| SELECT with WHERE on indexed column | β Dramatically faster |
| SELECT with JOIN on indexed column | β Significantly faster |
| SELECT with ORDER BY on indexed column | β Avoids sorting step |
| INSERT on indexed table | β οΈ Slightly slower (index update) |
| UPDATE on indexed column | β οΈ Slower (index rebuild) |
| DELETE on indexed table | β οΈ Slightly slower (index cleanup) |
| Disk space | β οΈ Extra storage per index |
Index Monitoring
-- PostgreSQL: Check index usage statistics
SELECT
schemaname,
tablename,
indexname,
idx_scan AS times_used,
idx_tup_read AS rows_read,
idx_tup_fetch AS rows_fetched,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE tablename = 'employees'
ORDER BY idx_scan DESC;
-- PostgreSQL: Find unused indexes
SELECT
indexname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
-- Check if an index is being used (PostgreSQL)
EXPLAIN ANALYZE
SELECT * FROM employees WHERE last_name = 'Smith';
-- Look for "Index Scan" vs "Seq Scan" in the output
Common Index Patterns
| Pattern | Syntax | Use Case |
|---|---|---|
| Single column | CREATE INDEX idx ON t(col) | Equality lookups |
| Composite | CREATE INDEX idx ON t(col1, col2) | Multi-column WHERE |
| Unique | CREATE UNIQUE INDEX idx ON t(col) | Enforce uniqueness |
| Partial | CREATE INDEX idx ON t(col) WHERE condition | Filtered lookups |
| Covering | CREATE INDEX idx ON t(c1, c2) INCLUDE (c3, c4) | Avoid table lookups |
| Expression | CREATE INDEX idx ON t(LOWER(col)) | Function-based queries |
| Descending | CREATE INDEX idx ON t(col DESC) | Reverse-order sorts |
Never over-index. Each index consumes disk space and slows down writes. Profile your actual query patterns with EXPLAIN ANALYZE before adding indexes.
Database-Specific Notes
| Feature | PostgreSQL | MySQL (InnoDB) | SQL Server | Oracle |
|---|---|---|---|---|
| B-Tree index | β Default | β Default | β Default | β Default |
| Unique index | β | β | β | β |
| Partial index | β | β | β (filtered) | β |
| Expression index | β | β (8.0+) | β | β |
| Covering index (INCLUDE) | β | β (covering key) | β | β |
| GiST / GIN (full-text) | β | Full-text index | β | β |
| Hash index | β | β (Memory only) | β | β |
| Reverse key index | β | β | β | β |
Key Takeaways
- An index creates a sorted data structure on a column for fast O(log n) lookups instead of full table scans
- Composite indexes follow the leftmost prefix rule β the first column is always used
- Indexes speed up reads but slow down every write β balance is critical
- Use EXPLAIN ANALYZE to verify the database is actually using your indexes
- Drop unused indexes β they waste disk space and degrade write performance