System Design Foundations
Networking Fundamentals
Every distributed system communicates over networks. Understanding networking—from the physical layer to application protocols—is essential for making informed design decisions about latency, throughput, and reliability.
- TCP/IP — Reliable, ordered byte streams over unreliable networks
- HTTP/HTTPS — The application protocol that powers the web
- DNS — The internet's phonebook for name resolution
The network is never as reliable as you think it is.
The Network Stack
Networks are organized in layers, each abstracting the details of the layer below.
The OSI and TCP/IP Models
TCP vs UDP
Definition: TCP (Transmission Control Protocol)
TCP provides reliable, ordered, byte-stream communication. It establishes connections via a three-way handshake, guarantees delivery with acknowledgments and retransmission, and provides flow control and congestion control. TCP is used when correctness matters more than speed.
Definition: UDP (User Datagram Protocol)
UDP provides unreliable, connectionless, datagram-based communication. It has lower overhead and latency than TCP because it skips connection setup, acknowledgment, and retransmission. UDP is used when speed matters more than guaranteed delivery (e.g., video streaming, DNS, gaming).
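TCP Three-Way Handshake
The client sends SYN, the server answers with SYN-ACK, and the client confirms with ACK. The connection is usable after this single round trip, which is the handshake latency cost that UDP avoids.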
| Feature | TCP | UDP |
|---|---|---|
| Connection | Connection-oriented | Connectionless |
| Reliability | Guaranteed delivery | Best-effort |
| Ordering | Ordered bytes | Unordered datagrams |
| Overhead | Higher (headers, state) | Lower (minimal headers) |
| Latency | Higher (handshake) | Lower (no handshake) |
| Use Cases | HTTP, SSH, databases, email | DNS, video, gaming, VoIP |
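The difference between the two transports shows up directly in the socket API. The following is a minimal sketch, assuming a loopback interface is available; the function names `tcp_roundtrip` and `udp_roundtrip` are illustrative and not from any library. The TCP path needs `listen`, `connect`, and `accept` before any data moves, while the UDP path sends a datagram immediately.

```python
import socket


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send a small payload over loopback TCP and return what the server side reads."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        server.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            # connect() performs the three-way handshake with the listening socket.
            client.connect(server.getsockname())
            conn, _ = server.accept()
            with conn:
                conn.settimeout(2.0)
                # Small payloads fit in the socket buffer, so one thread is enough here.
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # signal end of the byte stream
                chunks = []
                while True:
                    chunk = conn.recv(4096)  # a stream may arrive in several pieces
                    if not chunk:
                        break
                    chunks.append(chunk)
                return b"".join(chunks)


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback UDP: no connection setup, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)  # a lost datagram would otherwise block forever
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(4096)  # one call returns one whole datagram
        return data
```

Note the read loop on the TCP side: TCP delivers a byte stream with no message boundaries, whereas each UDP `recvfrom` returns exactly one datagram.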
HTTP/HTTPS
HTTP is the foundation of data communication on the web.
Definition: HTTP (HyperText Transfer Protocol)
HTTP is an application-layer protocol for transmitting hypermedia documents. It follows a request-response model: a client sends a request (method + URL + headers + body) and the server returns a response (status code + headers + body). HTTP is stateless—each request is independent.
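To make the request and response shapes concrete, here is a rough sketch of how an HTTP/1.1 message looks on the wire. `build_request` and `parse_response` are hypothetical helpers written for this illustration, and they handle only the simple case (no chunked encoding, no repeated headers); real applications would use an HTTP library.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"") -> bytes:
    """Serialize method + URL path + headers + body into an HTTP/1.1 request."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw response into status code, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)  # e.g. ["HTTP/1.1", "200", "OK"]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(parts[1]), headers, body


request = build_request("GET", "/users/1", "api.example.com")
status, headers, body = parse_response(
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 2\r\n\r\nhi"
)
```

Nothing in the request refers to an earlier request, which is what statelessness means in practice: any state (sessions, authentication) has to travel in headers or the body.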
HTTP Methods and Semantics
| Method | Semantics | Idempotent | Safe | Use Case |
|---|---|---|---|---|
| GET | Read resource | Yes | Yes | Fetching data |
| POST | Create resource | No | No | Submitting forms, creating |
| PUT | Replace resource | Yes | No | Full updates |
| PATCH | Partial update | No | No | Partial modifications |
| DELETE | Remove resource | Yes | No | Deleting resources |
| HEAD | Metadata only | Yes | Yes | Health checks |
| OPTIONS | Capabilities | Yes | Yes | CORS preflight |
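The idempotency column is easiest to see by repeating a call and comparing the resulting state. The sketch below uses a hypothetical in-memory `UserStore` (not a real framework) whose methods mirror POST, PUT, and DELETE semantics as described in the table.

```python
import itertools


class UserStore:
    """Toy resource collection; method names mirror the HTTP methods they model."""

    def __init__(self):
        self.users = {}
        self._ids = itertools.count(1)

    def post(self, data: dict) -> int:
        # Not idempotent: every call creates a new resource with a fresh id.
        user_id = next(self._ids)
        self.users[user_id] = dict(data)
        return user_id

    def put(self, user_id: int, data: dict) -> None:
        # Idempotent: repeating the call leaves the same final state.
        self.users[user_id] = dict(data)

    def delete(self, user_id: int) -> None:
        # Idempotent: deleting an already-deleted resource changes nothing.
        self.users.pop(user_id, None)


store = UserStore()
store.put(100, {"name": "Ada"})
store.put(100, {"name": "Ada"})   # retry: still one user
store.post({"name": "Grace"})
store.post({"name": "Grace"})     # retry: a duplicate user appears
```

This is why clients and proxies can safely retry PUT and DELETE after a timeout, while retrying POST usually needs an application-level safeguard such as a deduplication key.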
HTTP Status Codes
| Code Range | Category | Examples |
|---|---|---|
| 1xx | Informational | 100 Continue, 101 Switching Protocols |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 404 Not Found |
| 5xx | Server Error | 500 Internal Server Error, 503 Service Unavailable |
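Because the category is encoded in the first digit, client code can branch on the range without knowing every individual code. A small sketch using Python's standard `http.HTTPStatus` enum; `categorize` and `describe` are illustrative helper names.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def categorize(code: int) -> str:
    """Map a status code to its category using the hundreds digit."""
    return CATEGORIES.get(code // 100, "Unknown")


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its category."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # code is not in the standard library's registry
        phrase = "Unregistered"
    return f"{code} {phrase} ({categorize(code)})"


print(describe(404))  # 404 Not Found (Client Error)
print(describe(503))  # 503 Service Unavailable (Server Error)
```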
HTTP/2 and HTTP/3
HTTP/2 introduced:
- Multiplexing: Multiple requests over a single TCP connection
- Header compression: HPACK reduces overhead
- Server push: Proactively send resources before the client requests them (little used in practice; Chrome has since removed support)
HTTP/3 (QUIC-based):
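- Faster connection establishment: No separate TCP handshake; QUIC combines transport and TLS 1.3 setup into one round trip, and resumed connections can send data with 0-RTT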
- No head-of-line blocking: Independent stream ordering
- Built-in encryption: TLS 1.3 integrated
HTTP/3 over QUIC eliminates TCP's head-of-line blocking problem. In HTTP/2 over TCP, a lost packet blocks ALL streams. In HTTP/3, only the affected stream is blocked. This is a significant improvement for high-latency, lossy networks.
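The effect of a single lost packet can be sketched with a deliberately simplified model. The `Packet` type and both functions below are assumptions made for illustration: real TCP orders bytes rather than packets, real QUIC tracks per-stream offsets, and retransmission timing is ignored entirely.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    seq: int      # position in the connection-wide send order
    stream: str   # logical stream, e.g. one per requested resource


def deliverable_tcp(packets: list[Packet], lost_seq: int) -> list[Packet]:
    """One ordered byte stream: nothing at or after the gap reaches the application."""
    return [p for p in packets if p.seq < lost_seq]


def deliverable_quic(packets: list[Packet], lost_seq: int) -> list[Packet]:
    """Per-stream ordering: only the lost packet's own stream stalls at the gap."""
    lost_stream = next(p.stream for p in packets if p.seq == lost_seq)
    return [
        p for p in packets
        if p.seq != lost_seq
        and not (p.stream == lost_stream and p.seq > lost_seq)
    ]


packets = [
    Packet(1, "css"), Packet(2, "js"), Packet(3, "css"),
    Packet(4, "img"), Packet(5, "js"),
]
tcp_ok = [p.seq for p in deliverable_tcp(packets, lost_seq=2)]
quic_ok = [p.seq for p in deliverable_quic(packets, lost_seq=2)]
```

With packet 2 (a `js` packet) lost, the TCP model delivers only packet 1, while the QUIC model still delivers the `css` and `img` packets and holds back only the later `js` packet until the retransmission arrives.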
DNS (Domain Name System)
DNS translates human-readable domain names to IP addresses.
Definition: DNS
DNS is a hierarchical, distributed naming system that maps domain names to IP addresses. A lookup follows a chain: the client asks a recursive resolver, which queries a root nameserver, then a TLD nameserver, then the authoritative nameserver, following the referral returned at each step and caching the final answer.
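The resolution chain can be sketched as a walk over referral tables. Everything here is a toy: the three dictionaries are hard-coded stand-ins for real nameservers, and the server names (`com-tld-server`, `ns1.example.com`) are placeholders chosen for the example. No network traffic is involved.

```python
# Hypothetical zone data standing in for the root, TLD, and authoritative servers.
ROOT_REFERRALS = {"com": "com-tld-server"}
TLD_REFERRALS = {"com-tld-server": {"example.com": "ns1.example.com"}}
AUTHORITATIVE = {"ns1.example.com": {"example.com": "93.184.216.34"}}


def resolve(name: str) -> tuple[str, list[str]]:
    """Resolve a name as a recursive resolver would, recording each server asked."""
    name = name.rstrip(".")
    labels = name.split(".")
    tld = labels[-1]                 # e.g. "com"
    zone = ".".join(labels[-2:])     # e.g. "example.com"

    trace = ["root"]
    tld_server = ROOT_REFERRALS[tld]               # root refers us to the TLD server
    trace.append(tld_server)
    auth_server = TLD_REFERRALS[tld_server][zone]  # TLD refers us to the zone's nameserver
    trace.append(auth_server)
    address = AUTHORITATIVE[auth_server][name]     # authoritative server gives the answer
    return address, trace


address, trace = resolve("example.com")
```

Each hop in `trace` is a network round trip in a real lookup, which is why resolvers cache aggressively.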
DNS Resolution Process
DNS Record Types
| Record | Purpose | Example |
|---|---|---|
| A | Maps domain to IPv4 | example.com → 93.184.216.34 |
| AAAA | Maps domain to IPv6 | example.com → 2606:2800:220:1:... |
| CNAME | Alias to another domain | www.example.com → example.com |
| MX | Mail exchange servers | example.com → mail.example.com |
| TXT | Text information (SPF, DKIM) | "v=spf1 include:..." |
| NS | Nameserver for domain | example.com → ns1.example.com |
| SOA | Start of authority metadata | Zone authority info |
DNS Caching
DNS uses multi-level caching for performance:
- Browser cache: Typically 60 seconds to 30 minutes
- OS cache: System-level DNS resolver cache
- ISP resolver cache: Shared across customers (TTL-based)
- Authoritative server: Source of truth (not a cache; it answers queries that miss every cache above)
DNS TTL (Time to Live) controls how long records are cached. Short TTLs (60s) allow fast changes but increase DNS query load. Long TTLs (24h) reduce load but slow propagation. For systems requiring rapid failover, use short TTLs on A records.
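The TTL trade-off is easier to reason about with a small cache in hand. This is a minimal sketch of TTL-based expiry, not how any particular resolver is implemented; `DnsCache` and its injectable `clock` parameter are illustrative names, with the clock made replaceable so expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Toy record cache where each entry expires after its TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expires_at)

    def put(self, name: str, address: str, ttl_seconds: float) -> None:
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name: str):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached: caller must query upstream
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return address


# Simulated clock so the example runs instantly.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("example.com", "93.184.216.34", ttl_seconds=60)

now[0] = 59.0
before_expiry = cache.get("example.com")   # still served from cache
now[0] = 60.0
after_expiry = cache.get("example.com")    # None: TTL elapsed
```

During failover, clients keep using the old address until their cached entry expires, so the TTL is an upper bound on how long a stale record can linger at each caching layer.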
Content Delivery Networks (CDNs)
CDNs cache content at edge locations closer to users to reduce latency.
Definition: CDN
A Content Delivery Network (CDN) is a geographically distributed network of proxy servers and data centers that delivers content to users based on their geographic location. CDNs reduce latency by serving content from the nearest edge location rather than the origin server.
CDN Cache Strategies
| Strategy | Description | Trade-off |
|---|---|---|
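| Pull CDN | Edge requests from origin on cache miss | Simple to operate and origin controls freshness, but the first request after a miss pays the origin round trip |
| Push CDN | Origin pushes content to edges | No first-request miss, but the origin must manage uploads and invalidation |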
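The two strategies differ mainly in who initiates the copy to the edge. The sketch below models each with a hypothetical class (`PullEdge`, `PushEdge`); TTLs, eviction, and multiple edge locations are left out to keep the contrast visible.

```python
class PullEdge:
    """Edge that fetches from the origin lazily, on the first cache miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._cache = {}
        self.origin_requests = 0

    def get(self, path: str) -> bytes:
        if path not in self._cache:
            self.origin_requests += 1          # miss: pay the origin round trip
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def purge(self, path: str) -> None:
        self._cache.pop(path, None)            # next get() will refetch


class PushEdge:
    """Edge that only serves what the origin has uploaded ahead of time."""

    def __init__(self):
        self._cache = {}

    def push(self, path: str, content: bytes) -> None:
        self._cache[path] = content

    def get(self, path: str):
        return self._cache.get(path)           # None if the origin never pushed it


origin = {"/logo.png": b"v1"}
pull = PullEdge(origin.__getitem__)
pull.get("/logo.png")
pull.get("/logo.png")                  # served from the edge, no origin request
origin["/logo.png"] = b"v2"
stale = pull.get("/logo.png")          # still the old copy until purged
pull.purge("/logo.png")
fresh = pull.get("/logo.png")
```

The stale read after the origin changes is the freshness cost of caching; production CDNs bound it with TTLs and explicit purge requests.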
CDN Architecture
Network Latency
Understanding latency is critical for system design decisions.
Round-Trip Time (RTT)
$$\text{RTT} = \frac{2d}{v}$$
Here,
- $\text{RTT}$ = Round-trip time (latency for request + response)
- $d$ = Physical distance between client and server
- $v$ = Signal propagation speed: ≈ 2/3 c for fiber optic ≈ 200,000 km/s
Realistic Latency Numbers
| Distance | Minimum RTT (Fiber) | Practical RTT |
|---|---|---|
| Same data center | < 1ms | 0.5 - 2ms |
| Same city (100km) | ~1ms | 2 - 5ms |
| Cross-country (4000km) | ~40ms | 50 - 80ms |
| Transatlantic (8000km) | ~80ms | 100 - 150ms |
| Transpacific (15000km) | ~150ms | 160 - 200ms |
The speed of light in fiber is approximately 200,000 km/s (2/3 of its speed in a vacuum). A round trip from New York to London (~5,500 km each way) takes at least 2 × 5,500 km ÷ 200,000 km/s = 55 ms for light propagation alone, before routing and processing overhead is added.
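The propagation bound is simple enough to compute directly. A small sketch, assuming the 200,000 km/s fiber speed quoted above and a straight-line fiber path (real routes are longer); `min_rtt_ms` and `overhead_ms` are illustrative names.

```python
FIBER_SPEED_KM_PER_S = 200_000  # about 2/3 of the speed of light in a vacuum


def min_rtt_ms(distance_km: float, speed_km_per_s: float = FIBER_SPEED_KM_PER_S) -> float:
    """Lower bound on round-trip time: the signal covers the distance twice."""
    return 2 * distance_km * 1000 / speed_km_per_s


def overhead_ms(practical_rtt_ms: float, distance_km: float) -> float:
    """Portion of an observed RTT not explained by propagation delay."""
    return practical_rtt_ms - min_rtt_ms(distance_km)


print(min_rtt_ms(5_500))        # New York to London: 55.0
print(min_rtt_ms(4_000))        # cross-country row of the table: 40.0
print(overhead_ms(65, 4_000))   # a 65 ms observed RTT leaves 25.0 ms of overhead
```

No amount of server tuning reduces the first term; only moving the endpoints closer together (for example, with a CDN edge) does.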
Practice Exercises
- Conceptual: Why does HTTP/3 use QUIC instead of TCP? What specific problem does this solve for modern web applications?
- Calculation: A user in Tokyo accesses a server in New York. The fiber path is 11,000 km. Calculate the minimum light propagation RTT. If the practical RTT is 140ms, what is the overhead?
- Design: Design a DNS strategy for a service that requires < 5 minute failover. What TTL values would you use? What are the trade-offs?
- Analysis: Compare HTTP/1.1, HTTP/2, and HTTP/3 for a single-page application that loads 50 resources. How does each version handle parallelism and head-of-line blocking?
Key Takeaways:
- TCP provides reliable, ordered delivery with connection overhead; UDP provides low-latency, best-effort delivery
- HTTP/2 introduced multiplexing; HTTP/3 (QUIC) eliminates transport-level head-of-line blocking and allows 0-RTT data on resumed connections
- DNS is a hierarchical, cached system—TTL values balance freshness against query load
- CDNs reduce latency by caching content at edge locations near users
- Network latency is bounded by the speed of light—geographic distance matters
What to Learn Next
- API Design: REST, GraphQL, gRPC, versioning, and rate limiting.
- Databases: SQL vs NoSQL, indexing, replication, and sharding.
- Caching Strategies: Redis, Memcached, cache invalidation, and write strategies.
- Load Balancing: Algorithms, health checks, and L4 vs L7.
- Message Queues: Kafka, RabbitMQ, event-driven architecture.
- Scalability Fundamentals: Vertical vs horizontal scaling and capacity planning.