Interview Question: "Explain the difference between READ COMMITTED and REPEATABLE READ. What anomalies can occur at each isolation level? How does MVCC prevent dirty reads?" — Asked at JPMorgan, Goldman Sachs, Bloomberg for Senior Database roles
ℹ️
Difficulty: Advanced | Companies: JPMorgan, Goldman Sachs, Bloomberg, Citadel, Two Sigma | Time: 60-75 minutes
Isolation Levels Hierarchy
The SQL standard defines four isolation levels with increasing consistency:
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Serialization Anomaly |
|---|---|---|---|---|
| READ UNCOMMITTED | Possible | Possible | Possible | Possible |
| READ COMMITTED | Prevented | Possible | Possible | Possible |
| REPEATABLE READ | Prevented | Prevented | Possible* | Possible |
| SERIALIZABLE | Prevented | Prevented | Prevented | Prevented |
*PostgreSQL prevents phantoms at REPEATABLE READ via predicate locking.
-- Create banking example
CREATE TABLE accounts (
account_id INT PRIMARY KEY,
customer_name VARCHAR(100),
balance DECIMAL(15,2),
version INT DEFAULT 1
);
INSERT INTO accounts VALUES
(1, 'Alice', 10000.00, 1),
(2, 'Bob', 15000.00, 1),
(3, 'Charlie', 8000.00, 1);
-- Transaction isolation level settings
-- PostgreSQL default is READ COMMITTED
SHOW default_transaction_isolation;
-- Set isolation level for session
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Or for next transaction only
BEGIN ISOLATION LEVEL SERIALIZABLE;
Dirty Read Demonstration
-- Session 1: READ UNCOMMITTED
BEGIN ISOLATION LEVEL READ UNCOMMITTED;
UPDATE accounts SET balance = 5000.00 WHERE account_id = 1;
-- Not committed yet
-- Session 2: READ UNCOMMITTED
BEGIN ISOLATION LEVEL READ UNCOMMITTED;
SELECT balance FROM accounts WHERE account_id = 1;
-- Returns 5000.00 (dirty read!)
-- Session 1: Rollback
ROLLBACK;
-- Session 2: Balance is now incorrect
-- Shows 5000.00 but actual is 10000.00
COMMIT;
Non-Repeatable Read Demonstration
-- Session 1: READ COMMITTED
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT balance FROM accounts WHERE account_id = 1;
-- Returns 10000.00
-- Session 2: Update and commit
BEGIN;
UPDATE accounts SET balance = 12000.00 WHERE account_id = 1;
COMMIT;
-- Session 1: Same query, different result
SELECT balance FROM accounts WHERE account_id = 1;
-- Returns 12000.00 (non-repeatable read!)
COMMIT;
Phantom Read Demonstration
-- Session 1: REPEATABLE READ
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM accounts WHERE balance > 9000;
-- Returns 2
-- Session 2: Insert new row
BEGIN;
INSERT INTO accounts VALUES (4, 'David', 11000.00, 1);
COMMIT;
-- Session 1: Same query, different count
SELECT COUNT(*) FROM accounts WHERE balance > 9000;
-- Returns 3 (phantom read!) in most databases
-- PostgreSQL returns 2 (prevents phantoms)
COMMIT;
MVCC Implementation
Multi-Version Concurrency Control maintains multiple versions:
-- Check MVCC system columns
SELECT
ctid, -- Physical row location (page, offset)
xmin, -- Transaction ID that created this version
xmax, -- Transaction ID that deleted/updated this version (0 if not deleted)
account_id,
balance
FROM accounts
WHERE account_id = 1;
-- Update to create new version
UPDATE accounts SET balance = 11000.00 WHERE account_id = 1;
-- Check both versions
SELECT
ctid,
xmin,
xmax,
account_id,
balance
FROM accounts
WHERE account_id = 1;
-- Check transaction visibility
SELECT
txid_current() AS current_txid,
txid_visible_in_snapshot(xmin, txid_current_snapshot()) AS created_visible,
txid_visible_in_snapshot(xmax, txid_current_snapshot()) AS deleted_visible
FROM accounts
WHERE account_id = 1;
SERIALIZABLE with Serializable Snapshot Isolation (SSI)
PostgreSQL uses SSI for SERIALIZABLE:
-- SSI prevents write skew anomaly
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- Session 1: Check doctors on call
SELECT COUNT(*) FROM doctors
WHERE on_call = TRUE AND department = 'ER';
-- Returns 2
-- Session 2: Same check
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM doctors
WHERE on_call = TRUE AND department = 'ER';
-- Returns 2
-- Session 1: Remove one doctor (assuming another is on call)
UPDATE doctors SET on_call = FALSE
WHERE doctor_id = 1 AND department = 'ER';
COMMIT;
-- Session 2: Also removes a doctor (write skew!)
UPDATE doctors SET on_call = FALSE
WHERE doctor_id = 2 AND department = 'ER';
-- Serialization failure! Would violate constraint
COMMIT;
Lock-Based Concurrency Control
-- Explicit locking
BEGIN;
-- Row-level lock
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE;
-- Other transactions block on this row
-- Share lock (allows reads, blocks writes)
SELECT * FROM accounts WHERE account_id = 1 FOR SHARE;
-- No-wait lock (fail immediately if locked)
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE NOWAIT;
-- Skip locked (skip rows that are locked)
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE SKIP LOCKED;
-- Advisory locks (application-level)
SELECT pg_advisory_lock(12345); -- Acquire
SELECT pg_try_advisory_lock(12345); -- Try acquire
SELECT pg_advisory_unlock(12345); -- Release
COMMIT;
Deadlock Detection and Prevention
-- Create deadlock scenario
-- Session 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
-- Wait for session 2 to release lock on account 2
-- Session 2
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 2;
-- Wait for session 1 to release lock on account 1
-- DEADLOCK! One session will be rolled back
-- Check for deadlocks
SELECT * FROM pg_stat_activity WHERE wait_event_type = 'Lock';
-- Set deadlock timeout
SET deadlock_timeout = '1s';
optimistic Concurrency Control
-- Version-based OCC
CREATE TABLE products (
product_id INT PRIMARY KEY,
name VARCHAR(100),
price DECIMAL(10,2),
version INT DEFAULT 1
);
-- Optimistic update pattern
WITH current_version AS (
SELECT version FROM products WHERE product_id = 1
)
UPDATE products
SET price = 29.99, version = version + 1
WHERE product_id = 1
AND version = (SELECT version FROM current_version)
RETURNING *;
-- If no rows updated, version changed by another transaction
Isolation Level Comparison Table
| Feature | READ COMMITTED | REPEATABLE READ | SERIALIZABLE |
|---|---|---|---|
| Lock Duration | Statement | Transaction | Transaction |
| MVCC Snapshot | Per statement | Per transaction | Per transaction |
| Phantoms | Allowed | Prevented | Prevented |
| Write Skew | Allowed | Allowed | Prevented |
| Performance | Best | Good | Moderate |
| Use Case | OLTP general | Reporting | Financial |
ℹ️
PostgreSQL REPEATABLE READ: Uses snapshot isolation, not traditional locking. Prevents phantoms via predicate locking, unlike MySQL/Oracle.
Advanced: Gap Locking
-- Gap locks prevent inserts in range
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- Lock gap between 100 and 200
SELECT * FROM accounts
WHERE account_id BETWEEN 100 AND 200 FOR UPDATE;
-- Other transactions cannot insert account_id in (100, 200)
-- This prevents phantom reads
INSERT INTO accounts VALUES (150, 'New', 5000.00, 1);
-- Blocks until first transaction commits
COMMIT;
Mathematical Formulation
For a set of transactions :
The serialization graph where:
- (transactions as vertices)
- if should come before
⚠️
Performance Impact: SERIALIZABLE can cause up to 10% throughput reduction due to aborts. Monitor with pg_stat_database conflicts.
Monitoring and Diagnostics
-- Check current locks
SELECT
l.pid,
l.mode,
l.granted,
a.query,
a.state
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE l.relation = 'accounts'::regclass;
-- Check transaction ID wraparound
SELECT
datname,
age(datfrozenxid) AS xid_age,
2^31 - age(datfrozenxid) AS transactions_until_wraparound
FROM pg_database;
-- Monitor serialization failures
SELECT
datname,
conflicts,
deadlocks,
conflicts_bet_rol
FROM pg_stat_database
WHERE datname = current_database();
ℹ️
Pro Tip: Use SET LOCAL to change isolation level for current transaction only. This is useful for mixed workloads.
Best Practices
- Use READ COMMITTED for most OLTP workloads
- Use SERIALIZABLE for financial transactions
- Implement retry logic for serialization failures
- Monitor lock contention with pg_stat_activity
- Avoid long-running transactions to reduce MVCC overhead
-- Retry pattern for serialization failures
DO $$
DECLARE
max_retries INT := 3;
retry_count INT := 0;
BEGIN
LOOP
BEGIN
-- Your transaction here
PERFORM transfer_funds(1, 2, 100);
EXIT; -- Success
EXCEPTION
WHEN serialization_failure OR deadlock_detected THEN
retry_count := retry_count + 1;
IF retry_count >= max_retries THEN
RAISE;
END IF;
PERFORM pg_sleep(random() * 0.1);
END;
END LOOP;
END $$;