Data Systems
Spanner and CockroachDB
Google Spanner pioneered globally distributed SQL. CockroachDB brought those ideas to the open-source world. Master their architectures, consensus protocols, and the trade-offs between them.
- Global Transactions: Serializable ACID across regions
- Consensus: Paxos (Spanner) vs Raft (CockroachDB)
- Consistency: External consistency vs serializable
These databases prove you don't have to sacrifice consistency for global scale.
Google Spanner
Google Spanner is a globally distributed, externally consistent relational database. It uses TrueTime (atomic clocks + GPS) to bound clock uncertainty between data centers, which lets it run serializable transactions across data centers worldwide. Spanner was the first system to offer externally consistent distributed transactions at global scale.
TrueTime
TrueTime is Spanner's clock synchronization API. It returns a time interval [earliest, latest] rather than a single point, and the true time is guaranteed to lie within that interval. The uncertainty (TTunc) is typically less than 7 ms. Spanner waits out this uncertainty before making a commit visible, which guarantees that commit timestamps reflect the real-time order of transactions across all data centers.
TrueTime Commit Rule
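$$TT_{\text{after}} \geq t_{\text{commit}} + TT_{\text{unc}}$$
Here,
- $t_{\text{commit}}$ = Timestamp assigned to the transaction
- $TT_{\text{after}}$ = TrueTime after the commit event
- $TT_{\text{unc}}$ = TrueTime uncertainty bound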
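The commit-wait idea can be sketched with a simulated clock. This is a minimal illustration, not Spanner's implementation: `SimulatedTrueTime`, `TTInterval`, and `commit_wait` are hypothetical names, the clock is advanced by hand instead of sleeping, and the sketch assumes a timestamp counts as passed once the interval's earliest bound reaches it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # ms
    latest: float    # ms


class SimulatedTrueTime:
    """Toy stand-in for TrueTime: a manually advanced clock with fixed uncertainty."""

    def __init__(self, start_ms: float, uncertainty_ms: float) -> None:
        self._now_ms = start_ms
        self.uncertainty_ms = uncertainty_ms

    def advance(self, ms: float) -> None:
        self._now_ms += ms

    def now(self) -> TTInterval:
        # The true time is assumed to lie somewhere inside this interval.
        return TTInterval(self._now_ms - self.uncertainty_ms,
                          self._now_ms + self.uncertainty_ms)

    def after(self, t_ms: float) -> bool:
        # Assumption: t counts as passed once the earliest bound reaches it.
        return self.now().earliest >= t_ms


def commit_wait(tt: SimulatedTrueTime, commit_ts_ms: float, step_ms: float = 1.0) -> float:
    """Advance the simulated clock until commit_ts_ms has passed; return ms waited."""
    waited = 0.0
    while not tt.after(commit_ts_ms):
        tt.advance(step_ms)  # a real system would sleep here instead
        waited += step_ms
    return waited


tt = SimulatedTrueTime(start_ms=100.0, uncertainty_ms=7.0)
waited_ms = commit_wait(tt, commit_ts_ms=100.0)
print(waited_ms)  # 7.0
```

With a 7 ms uncertainty and a commit timestamp of 100, the loop runs until the clock reads 107, at which point even the earliest possible true time has reached the commit timestamp.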
TrueTime in Action
Transaction T1 commits in US-East at time t1=100. Transaction T2 commits in EU-West at time t2=105.
Without TrueTime, network delays could cause T2 to appear to commit before T1 globally. With TrueTime, Spanner waits out the uncertainty interval. If TTunc=7ms:
- T1's commit is not visible until t1 + 7ms = 107
- T2's commit is not visible until t2 + 7ms = 112
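- T1 is ordered before T2 for every observer: its timestamp is lower (100 < 105) and it becomes visible first (107 < 112)
This is external consistency: all observers see transactions in the same order, and that order matches the order in which the transactions committed in real time.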
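The arithmetic in this example can be checked with a few lines of Python. The helper name `visible_at` and the dictionary layout are illustrative assumptions; the numbers come from the example itself.

```python
TT_UNC_MS = 7  # uncertainty bound used in the example


def visible_at(commit_ts_ms: int, uncertainty_ms: int) -> int:
    """Earliest time at which a commit may become visible after commit wait."""
    return commit_ts_ms + uncertainty_ms


commits = {"T1": 100, "T2": 105}  # transaction -> commit timestamp (ms)
visibility = {name: visible_at(ts, TT_UNC_MS) for name, ts in commits.items()}

# The order by commit timestamp and the order by visibility time agree.
order_by_timestamp = sorted(commits, key=commits.get)
order_by_visibility = sorted(visibility, key=visibility.get)

print(visibility)  # {'T1': 107, 'T2': 112}
print(order_by_timestamp == order_by_visibility)  # True
```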
Spanner Architecture
CockroachDB
CockroachDB is a distributed SQL database inspired by Spanner. It provides serializable transactions, automatic sharding, and multi-region deployment using the Raft consensus protocol. Unlike Spanner, it runs on commodity hardware without requiring atomic clocks.
Raft Consensus
| Feature | Paxos (Spanner) | Raft (CockroachDB) |
|---|---|---|
| Understandability | Complex | Easier to reason about |
| Membership | Manual | Dynamic |
| Leader election | Implicit | Explicit |
| Implementation | Custom | Reference implementation |
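Whichever protocol is used, a write is considered committed once a majority of replicas have stored it. In Raft, the leader tracks how far each replica's log matches its own and advances the commit index accordingly. The sketch below shows only that majority calculation; the function names are hypothetical, and it omits Raft's additional rule that the entry at the commit index must belong to the leader's current term.

```python
from typing import Sequence


def quorum_size(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def commit_index(match_index: Sequence[int]) -> int:
    """Highest log index stored on a majority of replicas (leader included).

    match_index holds, for each replica, the last log index known to be
    replicated on it.
    """
    ranked = sorted(match_index, reverse=True)
    # The quorum-th highest value is held by at least a quorum of replicas.
    return ranked[quorum_size(len(ranked)) - 1]


# Five replicas: indexes up to 5 are on three of them, so 5 is committed.
print(commit_index([7, 5, 5, 3, 2]))  # 5
# Three replicas: only the leader has index 10, so the commit index is 4.
print(commit_index([10, 4, 3]))  # 4
```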
Spanner vs CockroachDB
| Feature | Spanner | CockroachDB |
|---|---|---|
| Consensus | Paxos | Raft |
| Time Source | TrueTime (atomic clocks) | Hybrid Logical Clocks (HLC) |
| Deployment | GCP only | Any cloud, self-hosted |
| Consistency | External consistency | Serializable |
| Latency | Lower (TrueTime) | Higher (HLC uncertainty) |
| Cost | Higher (GCP pricing) | Lower (open source) |
| Maturity | 10+ years | 7+ years |
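CockroachDB's Hybrid Logical Clocks (HLCs) combine each node's NTP-synchronized wall clock with a logical counter, so timestamps respect causality without atomic clocks. The trade-off is a much larger uncertainty window, bounded by a configured maximum clock offset. When a read encounters a value whose timestamp falls inside that window, CockroachDB retries the read at a higher timestamp rather than waiting out the uncertainty on every commit. These retries can add latency compared with Spanner, but the approach works on commodity hardware.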
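A hybrid logical clock can be sketched in a few dozen lines. The version below follows the generic HLC update rules for local and receive events; it is not CockroachDB's code, and the names `HLC`, `HLCTimestamp`, `in_uncertainty_window`, and `max_offset_ms` are assumptions made for illustration. The uncertainty check is simplified to compare wall-clock components only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(order=True, frozen=True)
class HLCTimestamp:
    wall_ms: int   # physical component
    logical: int   # tie-breaking counter


class HLC:
    def __init__(self, physical_clock: Callable[[], int]) -> None:
        self._clock = physical_clock  # returns wall-clock time in ms
        self._last = HLCTimestamp(0, 0)

    def now(self) -> HLCTimestamp:
        """Timestamp a local or send event."""
        pt = self._clock()
        if pt > self._last.wall_ms:
            self._last = HLCTimestamp(pt, 0)
        else:
            # Wall clock has not moved forward: bump the logical counter.
            self._last = HLCTimestamp(self._last.wall_ms, self._last.logical + 1)
        return self._last

    def update(self, remote: HLCTimestamp) -> HLCTimestamp:
        """Merge a timestamp received from another node."""
        pt = self._clock()
        wall = max(self._last.wall_ms, remote.wall_ms, pt)
        if wall == self._last.wall_ms and wall == remote.wall_ms:
            logical = max(self._last.logical, remote.logical) + 1
        elif wall == self._last.wall_ms:
            logical = self._last.logical + 1
        elif wall == remote.wall_ms:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = HLCTimestamp(wall, logical)
        return self._last


def in_uncertainty_window(read_ts: HLCTimestamp, value_ts: HLCTimestamp,
                          max_offset_ms: int) -> bool:
    """True if a value is newer than the read but within the clock-offset bound."""
    return read_ts.wall_ms < value_ts.wall_ms <= read_ts.wall_ms + max_offset_ms


node_a = HLC(lambda: 100)  # node A's wall clock reads 100 ms
node_b = HLC(lambda: 90)   # node B's wall clock lags behind at 90 ms

sent = node_a.now()
received = node_b.update(sent)
print(sent, received, received > sent)
```

Even though node B's wall clock is behind node A's, the timestamp B assigns on receipt is ordered after the one A sent, which is the causality property the logical counter provides.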
When to Use Each
| Use Case | Recommended |
|---|---|
| Global financial system on GCP | Spanner |
| Multi-cloud distributed SQL | CockroachDB |
| Need lowest latency globally | Spanner |
| Need to avoid vendor lock-in | CockroachDB |
| Already invested in GCP | Spanner |
| Need self-hosted option | CockroachDB |
Practice Exercises
- Architecture Comparison: Compare the write paths of Spanner and CockroachDB. What happens when a write arrives in the US but the leader for that data is in Europe?
- Transaction Design: Design a distributed transaction for transferring money between accounts in different regions. How do you handle the coordination and ensure consistency?
- Consistency Analysis: Explain the difference between external consistency and serializability. Why does Spanner's TrueTime matter?
- Migration Planning: Your team is migrating from PostgreSQL to CockroachDB. What are the main challenges, and how would you approach the migration?
Key Takeaways:
- Spanner uses TrueTime for globally consistent reads and writes
- CockroachDB uses Raft consensus without requiring specialized hardware
- TrueTime provides uncertainty bounds; Spanner waits them out for consistency
- CockroachDB's HLC is a practical alternative to TrueTime
- Both databases provide distributed ACID transactions
- Choose Spanner for GCP-native; CockroachDB for multi-cloud or self-hosted
What to Learn Next
-> NewSQL and Distributed SQL: Overview of NewSQL databases and distributed SQL concepts.
-> PostgreSQL Deep Dive: Advanced PostgreSQL features, extensions, and optimization.
-> SQL Deep Dive: PostgreSQL, MySQL, indexing strategies, and query optimization.
-> Distributed Consensus: Paxos, Raft, and consensus protocols.
-> Data Replication: Sync vs async replication, leader election, and consistency.
-> CAP Theorem: Consistency models, availability, and partition tolerance.