Content Delivery Networks
A CDN distributes content to edge servers worldwide, reducing latency by serving users from the nearest location. CDNs are critical for performance at global scale.
- Edge Caching: Content served from locations near users
- DNS Routing: Users directed to nearest edge server
- Offloading: Origin server load reduced by 90%+
CDNs are the invisible infrastructure that makes the internet feel fast.
What Is a CDN?
A CDN is a distributed network of servers that caches and delivers content from edge locations close to users.
Definition: Content Delivery Network
A CDN (Content Delivery Network) is a geographically distributed network of proxy servers and data centers. CDNs cache static and dynamic content at edge locations, reducing latency by serving content from the nearest point of presence (PoP). They improve performance, availability, and security while reducing origin server load.
Latency Reduction
$$L_{\text{edge}} = \frac{d}{c}$$
Here,
- $L_{\text{edge}}$ = Network latency to edge server (one-way)
- $d$ = Physical distance to nearest PoP
- $c$ = Speed of light in fiber (~200,000 km/s)
Latency Improvement
Without CDN (user in Tokyo, origin in US-East):
- Distance: ~11,000 km
- Latency: ~55ms one-way, ~110ms round-trip
With CDN (edge in Tokyo):
- Distance: ~5 km to nearest PoP
- Latency: ~0.025 ms one-way propagation; roughly 1 ms round-trip in practice once last-mile and processing overhead are added
Improvement: roughly 100x lower round-trip latency
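The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch that models propagation delay only (distance divided by the speed of light in fiber); it ignores routing detours, queuing, and server processing, so real measurements will be higher. The function names are illustrative, not part of any library.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in fiber, from the text


def propagation_delay_ms(distance_km: float,
                         speed_km_per_s: float = FIBER_SPEED_KM_PER_S) -> float:
    """One-way propagation delay in milliseconds over a fiber path."""
    return distance_km / speed_km_per_s * 1000


def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay: out and back over the same distance."""
    return 2 * propagation_delay_ms(distance_km)


if __name__ == "__main__":
    for label, km in [("origin in US-East", 11_000), ("edge PoP in Tokyo", 5)]:
        one_way = propagation_delay_ms(km)
        print(f"{label}: {one_way:.3f} ms one-way, {round_trip_ms(km):.3f} ms round-trip")
```

For the 5 km case the pure propagation figure is far below a millisecond, which is why the practical round-trip time near an edge is dominated by overhead rather than distance.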
DNS-Based Routing
CDNs use DNS to direct users to the nearest edge server.
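Definition: GeoDNS
GeoDNS, a common form of Global Server Load Balancing (GSLB), resolves domain names to the IP address of a nearby edge server based on the client's approximate geographic location, usually inferred from the IP address of the client's recursive resolver or from the EDNS Client Subnet option. This happens at the DNS level, directing users to the optimal PoP before any HTTP connection is made.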
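The selection step can be sketched as a nearest-neighbour lookup over PoP coordinates. The sketch below is an illustration under stated assumptions, not how any particular DNS provider works: the PoP names, coordinates, and IP addresses are hypothetical (the IPs come from documentation-only ranges), and real systems also weigh PoP health, load, and network topology rather than distance alone.

```python
import math

# Hypothetical PoP catalogue: name -> (latitude, longitude, edge IP).
POPS = {
    "tokyo": (35.68, 139.69, "192.0.2.10"),
    "frankfurt": (50.11, 8.68, "198.51.100.10"),
    "virginia": (38.95, -77.45, "203.0.113.10"),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def resolve_nearest_pop(client_lat, client_lon):
    """Return (pop_name, edge_ip) for the PoP closest to the estimated client location."""
    name = min(
        POPS,
        key=lambda n: haversine_km(client_lat, client_lon, POPS[n][0], POPS[n][1]),
    )
    return name, POPS[name][2]
```

A call such as `resolve_nearest_pop(34.69, 135.50)` (roughly Osaka) would pick the Tokyo entry from this catalogue.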
Push vs Pull CDN
| Strategy | Description | Use Case |
|---|---|---|
| Push | Origin pushes content to edge servers | Large media files, predictable access |
| Pull | Edge servers fetch from origin on first request | Dynamic content, long-tail assets |
| Hybrid | Pull with background refresh | Most modern CDNs |
Definition: Push CDN
A push CDN proactively distributes content to edge servers. The origin uploads files to all edges, which cache them before users request them. Best for content that is accessed frequently and changes infrequently.
Definition: Pull CDN
A pull CDN caches content on first request. When an edge server receives a request for uncached content, it fetches from the origin, caches it, and serves it. Subsequent requests hit the cache. Best for large sites with many assets.
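The pull behaviour (miss, fetch from origin, cache, then hit) can be sketched as a small pull-through cache. This is a simplified illustration: the class name and the `fetch_from_origin` callback are hypothetical, and expiry, eviction, and concurrent request coalescing are deliberately left out.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Edge cache that contacts the origin only when a path is not yet cached."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._store:
            # Cache hit: serve directly from the edge.
            self.hits += 1
            return self._store[path]
        # Cache miss: fetch from the origin, store, then serve.
        self.misses += 1
        body = self._fetch(path)
        self._store[path] = body
        return body
```

With this shape, the first `get("/logo.png")` calls the origin once and every later request for the same path is served from the edge's local store.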
Cache Strategies
Definition: Cache-Control Headers
CDNs respect HTTP cache headers to determine caching behavior:
- max-age: How long to cache (in seconds)
- s-maxage: CDN-specific max-age (overrides max-age for CDN)
- no-cache: Revalidate with origin before serving
- no-store: Never cache
- stale-while-revalidate: Serve stale content while fetching fresh
Use stale-while-revalidate for content that can be slightly stale. This serves cached content immediately while fetching updates in the background, providing both freshness and low latency.
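How these directives combine can be sketched as a small decision function for a shared cache. This is a simplified model, not a complete implementation of HTTP caching: it handles only the five directives listed above, and the returned labels (such as "serve-stale-and-refresh") are illustrative names rather than standard terms.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control header into {directive: value, or True if valueless}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip()] = value.strip().strip('"') if sep else True
    return directives


def _seconds(directives: dict, name: str):
    """Return the directive as an int number of seconds, or None if absent or invalid."""
    value = directives.get(name)
    return int(value) if isinstance(value, str) and value.isdigit() else None


def shared_cache_decision(header: str, age_seconds: int) -> str:
    """Classify a cached response of the given age from a CDN edge's point of view."""
    d = parse_cache_control(header)
    if "no-store" in d:
        return "do-not-store"
    if "no-cache" in d:
        return "revalidate"
    # For shared caches, s-maxage takes precedence over max-age.
    ttl = _seconds(d, "s-maxage")
    if ttl is None:
        ttl = _seconds(d, "max-age") or 0
    swr = _seconds(d, "stale-while-revalidate") or 0
    if age_seconds < ttl:
        return "fresh"
    if age_seconds < ttl + swr:
        return "serve-stale-and-refresh"
    return "expired"
```

For example, with `public, max-age=60, s-maxage=300, stale-while-revalidate=30`, a response aged 310 seconds falls in the stale-while-revalidate window, so the edge can answer immediately and refresh in the background.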
Cache Invalidation
Definition: Cache Invalidation
Cache invalidation is the process of removing or updating cached content before it expires. Methods include:
- Purge: Immediate removal by URL or tag
- Soft purge: Mark as stale, revalidate on next request
- Versioned URLs: Append hash to filename (e.g., app.abc123.js)
- Time-based: Natural expiration via max-age
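Of these methods, versioned URLs are the easiest to illustrate in code: a content hash in the filename means any change produces a new URL, so old cached copies never need purging. The sketch below is one possible approach, assuming SHA-256 and an 8-character digest; the function name and digest length are illustrative choices, and build tools typically do this step automatically.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


if __name__ == "__main__":
    v1 = versioned_name("static/app.js", b"console.log('v1');")
    v2 = versioned_name("static/app.js", b"console.log('v2');")
    print(v1)
    print(v2)  # different content, so a different URL and a fresh cache entry
```

Because the name changes only when the content does, such assets can be served with a very long max-age.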
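CDN Architecture
A multi-tier CDN places a shield (mid-tier) cache between the edge servers and the origin. Requests flow edge -> shield -> origin, so misses at many edges for the same object collapse into a single origin fetch.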
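The effect of a shield tier on origin load can be sketched with chained caches. This is a toy model under stated assumptions: the class and function names are hypothetical, caches never expire or evict, and there is a single shield in front of one origin.

```python
class Origin:
    """Stand-in origin server that counts how many requests reach it."""

    def __init__(self):
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return f"content of {path}"


class CacheTier:
    """One cache layer that falls back to its parent (a tier or the origin) on a miss."""

    def __init__(self, name, parent_fetch):
        self.name = name
        self._parent_fetch = parent_fetch
        self._store = {}

    def get(self, path):
        if path not in self._store:
            self._store[path] = self._parent_fetch(path)
        return self._store[path]


def build_tiered_cdn(edge_names):
    """Wire up edge -> shield -> origin and return all three layers."""
    origin = Origin()
    shield = CacheTier("shield", origin.fetch)
    edges = {name: CacheTier(name, shield.get) for name in edge_names}
    return origin, shield, edges
```

If three edges each receive requests for the same object, only the first miss travels past the shield, so the origin sees one request instead of three.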
CDN Security
| Feature | Description |
|---|---|
| DDoS Protection | Absorbs volumetric attacks at edge |
| WAF | Web Application Firewall at edge |
| TLS Termination | Terminates TLS (SSL) connections at the edge |
| Token Authentication | Signed URLs for private content |
| Bot Management | Identifies and blocks malicious bots |
Modern CDNs like Cloudflare, Akamai, and AWS CloudFront provide security services beyond caching. They offer DDoS mitigation, WAF rules, bot detection, and API gateway capabilities at the edge.
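Token authentication with signed URLs can be sketched with an HMAC over the path and an expiry time. This is a generic scheme for illustration only, not the format used by Cloudflare, Akamai, or CloudFront; the query parameter names `expires` and `sig` and both function names are assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    """HMAC-SHA256 over the path and expiry, hex encoded."""
    message = f"{path}:{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, secret: bytes, expires_at: int) -> str:
    """Append an expiry timestamp and signature to the path as query parameters."""
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at, secret)})
    return f"{path}?{query}"


def verify_url(url: str, secret: bytes, now: Optional[int] = None) -> bool:
    """Check at the edge that the URL is unexpired and the signature matches."""
    now = int(time.time()) if now is None else now
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError, IndexError):
        return False
    if now > expires_at:
        return False
    expected = _signature(parts.path, expires_at, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

The origin application signs URLs for authorized users, and the edge verifies them with the shared secret, so private content can still be cached and served without contacting the origin on every request.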
Practice Exercises
- Design: Design a CDN strategy for a video streaming platform serving 100M users globally. Consider hot content, long-tail videos, and live streaming.
- Cache Invalidation: A news website publishes breaking news that must be globally visible within 30 seconds. Design a cache invalidation strategy using a CDN.
- Analysis: Compare push and pull CDN strategies for an e-commerce site with 10,000 product pages and 500,000 product images.
- Cost: A CDN charges 0.01/10,000 requests. Your site serves 1TB/month with 100M requests. Calculate the monthly CDN cost.
Key Takeaways:
- CDNs reduce latency by serving content from edge locations near users
- DNS-based routing directs users to the nearest PoP
- Pull CDNs cache on first request; push CDNs pre-distribute content
- Cache-Control headers and versioned URLs manage caching behavior
- Modern CDNs provide security services: DDoS protection, WAF, bot management
- Multi-tier architecture (edge -> shield -> origin) reduces origin load by 90%+
What to Learn Next
-> Proxy and Reverse Proxy: Forward proxy, Nginx, HAProxy, and SSL termination.
-> Rate Limiting: Token bucket, sliding window, and distributed rate limiting.
-> Load Balancing: Distribution algorithms and L4 vs L7 load balancing.
-> Networking Fundamentals: TCP/IP, HTTP, DNS, and network latency.
-> Security Patterns: Authentication, authorization, encryption, and mTLS.
-> Cost Optimization: Cloud cost management and right-sizing.