System Design — Data Systems

Caching Strategies

Caching is the single most impactful performance optimization in system design. By storing frequently accessed data in fast storage (memory), you can reduce latency from milliseconds to microseconds and dramatically reduce database load.

Cache-Aside — Application manages cache explicitly
Write-Through — Writes go to cache and database simultaneously
Redis vs Memcached — When to use each caching technology

There are only two hard things in computer science: cache invalidation and naming things. — Phil Karlton

What Is Caching?

DfCaching

Caching is the technique of storing a copy of data in a faster-access storage layer (typically memory) so that future requests for that data can be served without accessing the original, slower data source. The fundamental trade-off is space and consistency in exchange for speed.

Where Caches Live

Cache Topologies

DfLocal Cache

A local cache (in-process) stores data in the application's memory space. It's the fastest option (nanosecond access) but is limited to a single process and doesn't share state across instances.

DfDistributed Cache

A distributed cache stores data across multiple networked servers. It provides shared state across application instances and can scale independently, but introduces network latency (microsecond to millisecond access).

Property	Local Cache	Distributed Cache
Latency	Nanoseconds	Microseconds-milliseconds
Scope	Single process	All instances
Invalidation	Simple	Complex (broadcast needed)
Memory	Limited by process	Scales independently
Consistency	Eventual	Configurable

Cache Eviction Policies

When the cache is full, eviction policies determine which entries to remove:

Policy	Description	Trade-off
LRU	Least Recently Used	Good general-purpose, may evict rarely-used but important items
LFU	Least Frequently Used	Better for skewed distributions, slower to update
FIFO	First In First Out	Simplest, no frequency/recency awareness
TTL	Time To Live	Expires after fixed duration, may evict useful data
Random	Random eviction	Surprisingly effective, no tracking overhead

Cache Hit Ratio

\text{Hit Ratio} = \frac{\text{Cache Hits}}{\text{Cache Hits} + \text{Cache Misses}}

Here,

$Cache Hits$ =Requests served from cache
$Cache Misses$ =Requests that required fetching from origin

Impact of Hit Ratio on Latency

Consider a system with:

Database latency: 10ms
Cache latency: 0.5ms
95% cache hit ratio

Average latency = 0.95 × 0.5ms + 0.05 × 10ms = 0.475ms + 0.5ms = 0.975ms

Without caching: 10ms average With caching: ~1ms average — a 10x improvement

At 99% hit ratio: 0.99 × 0.5 + 0.01 × 10 = 0.595ms — a 17x improvement

Cache-Aside (Lazy Loading)

The most common caching pattern:

Flow:

Application checks cache for data
Cache hit: Return data directly from cache
Cache miss: Fetch from database, store in cache, return data
On write: Update database, invalidate cache (don't update cache)

In cache-aside, always invalidate (delete) the cache on writes rather than updating it. This avoids race conditions where a read-modify-write cycle on the cache overwrites a concurrent database update.

Write-Through

Writes go to both cache and database simultaneously:

Pros: Cache is always consistent, simple to implement Cons: Write latency is cache + DB, cache may be filled with data never read

Write-Back (Write-Behind)

Writes go to cache first, then asynchronously to database:

Pros: Very fast writes, batched database writes Cons: Data loss risk if cache fails before flush, complex implementation

Redis vs Memcached

Feature	Redis	Memcached
Data structures	Strings, lists, sets, hashes, sorted sets, streams	Strings only
Persistence	RDB + AOF	None
Replication	Built-in primary-replica	None
Cluster mode	Redis Cluster	Client-side sharding
Memory efficiency	Higher overhead per key	Very memory efficient
Pub/Sub	Built-in	None
Lua scripting	Supported	Not supported
Best for	Complex data, persistence, pub/sub	Simple key-value caching

Redis is more than a cache—it's an in-memory data store with rich data structures. Use Redis when you need sorted sets (leaderboards), lists (queues), or pub/sub. Use Memcached when you need simple, memory-efficient key-value caching at massive scale.

Cache Patterns in Practice

Cache Stampede Prevention

When a popular cache entry expires, many requests simultaneously try to rebuild it:

Solutions to cache stampede:

Locking: Only one thread rebuilds the cache; others wait
Early refresh: Refresh cache before TTL expires (e.g., at 80% TTL)
Probabilistic early expiration: Randomly trigger refresh before TTL
Request coalescing: Deduplicate concurrent requests for the same key

Cache Invalidation Strategies

Strategy	Description	Consistency
TTL-based	Expires after fixed duration	Eventual
Event-driven	Invalidate on write events	Near real-time
Version-based	Cache key includes version	Strong
Tag-based	Group related keys for bulk invalidation	Configurable

Practice Exercises

Design: Design a caching strategy for a news website with 10M daily visitors. What would you cache? What TTL would you use? How do you handle breaking news?
Analysis: Compare cache-aside, write-through, and write-back for a banking application. Which pattern is most appropriate and why?
Debugging: A system has a 99% cache hit ratio but still shows high latency at p99. What could cause this? How would you diagnose it?
Architecture: Design a distributed caching layer for a global e-commerce platform. Consider: data locality, consistency, failure handling, and warming the cache.

Key Takeaways:

Caching reduces latency by orders of magnitude; a 99% hit ratio can yield 10-20x improvement
Cache-aside is the most common pattern; always invalidate on writes, don't update
Write-through provides consistency; write-back provides performance; choose based on requirements
Redis is a feature-rich data store; Memcached is simple, memory-efficient caching
Cache stampede, invalidation, and consistency are the hardest problems in caching

What to Learn Next

-> Databases SQL vs NoSQL, indexing, replication, and sharding.

-> Load Balancing Algorithms, health checks, and L4 vs L7.

-> Message Queues Kafka, RabbitMQ, event-driven architecture.

-> CAP Theorem Consistency models, availability, and partition tolerance.

-> Microservices Service decomposition, discovery, and API gateways.

-> Scalability Fundamentals Vertical vs horizontal scaling and capacity planning.

Caching Strategies

Caching Strategies

What Is Caching?

DfCaching

Where Caches Live

Cache Topologies

DfLocal Cache

DfDistributed Cache

Cache Eviction Policies

Cache Hit Ratio

Impact of Hit Ratio on Latency

Cache-Aside (Lazy Loading)

Write-Through

Write-Back (Write-Behind)

Redis vs Memcached

Cache Patterns in Practice

Cache Stampede Prevention

Cache Invalidation Strategies

Practice Exercises

What to Learn Next

Premium Content

Need Expert System Design Help?