WebSockets and Real-Time
WebSockets enable bidirectional communication between client and server. Master the WebSocket protocol, connection management, scaling strategies, and the patterns behind real-time features.
- Bidirectional: Both client and server can send messages
- Persistent: Connection stays open for low latency
- Scalable: Horizontal scaling with pub/sub backends
WebSockets turn HTTP's request-response into a real-time conversation.
WebSocket Protocol
Definition: WebSocket
WebSocket is a communication protocol that provides full-duplex communication over a single TCP connection. It starts with an HTTP handshake: the client sends a request with an Upgrade header, and the server answers with a 101 Switching Protocols response. After that, the same TCP connection carries a persistent, bidirectional stream of lightweight frames. WebSockets are designed for low-latency, high-throughput communication between client and server.
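To make the handshake concrete, here is a minimal sketch of the server side of the upgrade as specified in RFC 6455: the server concatenates the client's Sec-WebSocket-Key with a fixed GUID, hashes it with SHA-1, and returns the base64 result in Sec-WebSocket-Accept. The function names are illustrative and not taken from any particular library.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept_key(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake_response(sec_websocket_key: str) -> str:
    """Build the 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {compute_accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

In practice a WebSocket library performs this step for you; the sketch only shows why the upgrade is cheap and why it can share ports 80 and 443 with ordinary HTTP traffic.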
HTTP vs WebSocket
| Aspect | HTTP | WebSocket |
|---|---|---|
| Communication | Request-response | Bidirectional |
| Connection | Short-lived | Persistent |
| Latency | Higher (headers on every request) | Low (minimal framing) |
| Server Push | Polling/SSE | Native support |
| Use Case | REST APIs, CRUD | Chat, games, live updates |
WebSockets are not always better than HTTP. For request-response patterns, HTTP/2 with multiplexing is often sufficient. WebSockets shine when you need real-time updates or bidirectional communication.
Connection Lifecycle
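A connection moves through four states: it is connecting during the handshake, open while messages flow, closing once either side starts the closing handshake, and closed afterwards. The sketch below models those transitions as a small state machine. The state names mirror the readyState values of the browser WebSocket API; the ConnectionLifecycle class itself is a hypothetical helper for illustration.

```python
from enum import Enum


class State(Enum):
    CONNECTING = 0  # HTTP upgrade handshake in progress
    OPEN = 1        # handshake done, frames can flow both ways
    CLOSING = 2     # close frame sent or received, waiting for the peer
    CLOSED = 3      # TCP connection released


# Transitions a well-behaved connection may take.
ALLOWED = {
    State.CONNECTING: {State.OPEN, State.CLOSED},  # a failed handshake closes directly
    State.OPEN: {State.CLOSING, State.CLOSED},     # an abrupt network drop skips CLOSING
    State.CLOSING: {State.CLOSED},
    State.CLOSED: set(),
}


class ConnectionLifecycle:
    """Tracks the state of one connection and rejects invalid transitions."""

    def __init__(self) -> None:
        self.state = State.CONNECTING

    def transition(self, new_state: State) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"invalid transition {self.state.name} -> {new_state.name}")
        self.state = new_state
```

A closed connection never reopens: reconnecting means creating a new connection and repeating the handshake.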
Scaling WebSockets
The Challenge
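WebSockets are stateful connections. Unlike HTTP requests, which a load balancer can route to any server, a WebSocket connection stays pinned to the server that accepted it for its entire lifetime. This makes horizontal scaling more complex: two users who need to exchange messages may be connected to different servers, so the servers need a way to relay messages to each other.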
Scaling Architecture
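A common way to scale is to place several WebSocket servers behind a load balancer and connect them through a shared pub/sub backend, so that a message sent on one server reaches clients connected to the others. The sketch below simulates that fan-out in a single process. InMemoryBroker is a hypothetical stand-in for a real backend such as Redis, and client connections are represented by plain lists acting as inboxes.

```python
from collections import defaultdict


class InMemoryBroker:
    """Hypothetical stand-in for a pub/sub backend shared by all servers."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic, message) -> None:
        for callback in list(self._subscribers[topic]):
            callback(message)


class WebSocketServer:
    """One server node; it only knows about its own local connections."""

    def __init__(self, name, broker) -> None:
        self.name = name
        self.broker = broker
        self.local_clients = defaultdict(dict)  # room -> {client_id: inbox}
        self._subscribed_rooms = set()

    def connect(self, client_id, room):
        inbox = []  # stands in for the client's open socket
        self.local_clients[room][client_id] = inbox
        if room not in self._subscribed_rooms:
            # Subscribe once per room, no matter how many local clients join.
            self._subscribed_rooms.add(room)
            self.broker.subscribe(room, lambda msg, room=room: self._deliver(room, msg))
        return inbox

    def _deliver(self, room, message) -> None:
        for inbox in self.local_clients[room].values():
            inbox.append(message)

    def send(self, room, message) -> None:
        # Publish instead of delivering locally, so every server sees the message.
        self.broker.publish(room, message)
```

For example, if one client joins the room "lobby" on one server and a second client joins it on another, a send on either server appends the message to both inboxes, because each server subscribed to the room when its first local client joined.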
Connection Management
| Pattern | Description |
|---|---|
| Heartbeat | Regular pings to detect dead connections |
| Reconnection | Exponential backoff on disconnect |
| Message Queueing | Buffer messages during disconnect |
| Connection Limits | Per-server and per-user limits |
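The reconnection row can be illustrated with a short sketch of exponential backoff with full jitter: the delay ceiling doubles on every failed attempt up to a cap, and the actual delay is drawn at random below that ceiling so that clients do not all reconnect at the same moment after an outage. The 1-second base and 30-second cap are assumed defaults, not values prescribed by the protocol.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng=random.random) -> float:
    """Return the wait in seconds before reconnect attempt `attempt` (0-based).

    Full jitter: a random value between 0 and min(cap, base * 2**attempt).
    `rng` is injectable so the function can be tested deterministically.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def reconnect_schedule(max_retries: int, **kwargs) -> list:
    """Delays for a whole reconnect sequence, one entry per attempt."""
    return [backoff_delay(attempt, **kwargs) for attempt in range(max_retries)]
```

A client would sleep for each delay in turn, stop as soon as a connection succeeds, and reset the attempt counter to zero after a successful reconnect.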
WebSocket Connection Capacity
$$C_{max} = \frac{M_{available}}{M_{conn}}$$
Here,
- $C_{max}$ = Maximum connections per server
- $M_{available}$ = Memory available for connections
- $M_{conn}$ = Memory per WebSocket connection (~10KB)
Connection Capacity Planning
Server with 16GB RAM, 4GB reserved for OS:
C_max = (16GB - 4GB) / 10KB = ~1.2M connections per server
For 10M concurrent users: ceil(10M / 1.2M) = 9 WebSocket servers minimum.
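The same arithmetic can be sketched as a small calculation. It assumes decimal units (1GB = 10^9 bytes, 1KB = 10^3 bytes) and treats memory as the only constraint, so the result is an upper bound: CPU, file descriptor limits, and network bandwidth usually reduce the practical number per server.

```python
import math

GB = 10 ** 9  # decimal units, an assumption made for round numbers
KB = 10 ** 3


def max_connections(total_ram: int, reserved: int, per_connection: int) -> int:
    """Memory-only upper bound on connections per server (C_max)."""
    return (total_ram - reserved) // per_connection


def servers_needed(concurrent_users: int, connections_per_server: int) -> int:
    """Minimum server count, with no headroom for failover."""
    return math.ceil(concurrent_users / connections_per_server)


c_max = max_connections(16 * GB, 4 * GB, 10 * KB)  # 1_200_000
servers = servers_needed(10_000_000, c_max)        # 9
```

Nine servers leave no spare capacity if one fails, so a real deployment would provision extra nodes on top of this minimum.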
Real-Time Patterns
| Pattern | Description | Use Case |
|---|---|---|
| Pub/Sub | Broadcast to topic subscribers | Chat rooms, live feeds |
| Presence | Track online/offline status | User presence indicators |
| Cursor sync | Broadcast cursor positions | Collaborative editing |
| Live counters | Real-time count updates | Like counts, view counts |
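As an illustration of the presence pattern, here is a minimal sketch that derives online status from heartbeat timestamps: a user counts as online if a heartbeat arrived within a timeout window. The PresenceTracker class and the 30-second timeout are assumptions for illustration. A multi-server deployment would keep this state in a shared store rather than in process memory.

```python
class PresenceTracker:
    """Tracks online status from the time of each user's last heartbeat."""

    def __init__(self, timeout_seconds: float = 30.0) -> None:
        self.timeout = timeout_seconds
        self._last_seen = {}  # user_id -> timestamp of last heartbeat

    def heartbeat(self, user_id: str, now: float) -> None:
        self._last_seen[user_id] = now

    def is_online(self, user_id: str, now: float) -> bool:
        last = self._last_seen.get(user_id)
        return last is not None and now - last <= self.timeout

    def online_users(self, now: float) -> list:
        return sorted(u for u in self._last_seen if self.is_online(u, now))
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the logic easy to test; a server would pass time.monotonic() on each ping.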
For production WebSocket systems, always implement: heartbeat detection, exponential backoff reconnection, message acknowledgment, and graceful degradation to long-polling for older browsers.
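Message acknowledgment can be sketched as a buffer of sent-but-unconfirmed messages: each outgoing message gets an ID, stays in the buffer until the peer acknowledges it, and is retransmitted after a reconnect if it is still pending. The OutboundBuffer class and its sequential ID scheme are hypothetical choices for illustration; WebSocket itself defines no application-level acknowledgment.

```python
from collections import OrderedDict


class OutboundBuffer:
    """Keeps messages until the receiver acknowledges them."""

    def __init__(self) -> None:
        self._next_id = 1
        self._pending = OrderedDict()  # msg_id -> payload, in send order

    def send(self, payload: str) -> int:
        """Record a message as in flight and return the ID to send with it."""
        msg_id = self._next_id
        self._next_id += 1
        self._pending[msg_id] = payload
        return msg_id

    def ack(self, msg_id: int) -> None:
        """Drop a message once the peer confirms receipt; unknown IDs are ignored."""
        self._pending.pop(msg_id, None)

    def unacked(self) -> list:
        """Messages to retransmit after a reconnect, oldest first."""
        return list(self._pending.items())
```

Because a retransmitted message may already have arrived before the connection dropped, the receiver should deduplicate by message ID.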
Practice Exercises
- Scaling Design: Design a WebSocket system for 10M concurrent users in a chat application. How many servers, what pub/sub backend, and how do you handle server failures?
- Protocol Design: Design the message protocol for a real-time collaborative document editor. Include message types, acknowledgment, and conflict resolution.
- Reconnection Strategy: Implement an exponential backoff reconnection strategy. What are the initial delay, max delay, jitter, and max retries?
- Performance Analysis: Compare WebSocket, SSE, and HTTP polling for a live sports score update feature. What are the trade-offs in terms of latency, bandwidth, and complexity?
Key Takeaways:
- WebSockets provide full-duplex, persistent connections over a single TCP connection
- Scaling requires sticky sessions or pub/sub backends for cross-server messaging
- Heartbeat detection and exponential backoff reconnection are essential
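- Budget roughly 10KB of memory per WebSocket connection as a planning estimate; actual usage depends on buffer sizes and per-connection application state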
- Pub/sub (Redis/Kafka) enables message distribution across WebSocket servers
- Choose WebSocket for bidirectional real-time; SSE for server-to-client only
What to Learn Next
-> Elixir and Phoenix for Real-Time: Building real-time systems with the BEAM VM and Phoenix Channels.
-> gRPC and Protobuf: High-performance RPC with protocol buffers.
-> Chat System Design: Designing real-time chat systems at scale.
-> Message Queues: Async processing, event-driven architecture, and pub/sub patterns.
-> Load Balancing: Distribution algorithms and L4 vs L7 load balancing.
-> Realtime Analytics Design: Designing real-time analytics dashboards.