System Design - Infrastructure
Message Queues
Message queues decouple producers from consumers, enabling asynchronous communication between services. They are the backbone of scalable, resilient, event-driven architectures.
- Kafka - Distributed event streaming platform for high-throughput data
- RabbitMQ - Traditional message broker with flexible routing
- Event-Driven - Architecture pattern where events drive system behavior
The best distributed system is the one where components do not need to know about each other.
What Are Message Queues?
Definition: Message Queue
A message queue is a middleware component that enables asynchronous communication between services by storing messages in a buffer until they are consumed. Producers send messages to the queue, and consumers read from it. This decouples the sender from the receiver in both time and space.
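The decoupling described above can be illustrated in-process with Python's standard `queue.Queue`. This is only a minimal sketch: a real message queue is a separate networked service, and the names `producer`, `consumer`, and `run_demo` are hypothetical helpers chosen for this example.

```python
import queue
import threading


def producer(q: queue.Queue, n: int) -> None:
    """Enqueue n messages and return; the producer never waits on a consumer."""
    for i in range(n):
        q.put(f"msg-{i}")
    q.put(None)  # sentinel: tells the consumer that no more messages follow


def consumer(q: queue.Queue, out: list) -> None:
    """Read messages from the buffer until the sentinel arrives."""
    while True:
        msg = q.get()
        if msg is None:
            break
        out.append(msg.upper())  # stand-in for real processing


def run_demo(n: int = 5) -> list:
    q = queue.Queue()  # unbounded buffer, so put() does not block
    processed = []
    # The producer finishes before any consumer exists: decoupled in time.
    producer(q, n)
    worker = threading.Thread(target=consumer, args=(q, processed))
    worker.start()
    worker.join()
    return processed


if __name__ == "__main__":
    print(run_demo(3))
```

The producer only knows about the queue, not about who reads from it, which is the "decoupled in space" half of the definition.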
Why Use Message Queues?
Benefits of Message Queues
| Benefit | Description |
|---|---|
| Decoupling | Producer and consumer do not need to know about each other |
| Asynchronous | Producer does not wait for consumer to process |
| Buffering | Queue absorbs traffic spikes, protecting downstream services |
| Scalability | Add consumers independently of producers |
| Resilience | If consumer fails, messages persist in queue |
| Ordering | Maintains message order within a partition or queue |
Kafka vs RabbitMQ
Definition: Apache Kafka
Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, durable event streaming. It stores events as an immutable, append-only log on disk, allowing multiple consumers to read the same events independently.
Definition: RabbitMQ
RabbitMQ is a traditional message broker whose core protocol is AMQP 0-9-1. It supports complex routing, message acknowledgment, and multiple messaging patterns (point-to-point, pub/sub, request-reply). With classic queues, a message is removed from the queue once a consumer acknowledges it.
| Feature | Kafka | RabbitMQ |
|---|---|---|
| Model | Distributed log | Message broker |
| Storage | Durable, append-only log on disk | Memory and/or disk (durable queues with persistent messages survive a broker restart) |
| Message retention | Configurable (time/size) | Deleted after acknowledgment |
| Consumer model | Pull (consumers poll) | Push (broker delivers) |
| Ordering | Guaranteed within partition | Guaranteed within queue |
| Throughput | Up to millions of msgs/sec across a cluster | Typically tens of thousands of msgs/sec |
| Latency | Low milliseconds | Sub-millisecond to low milliseconds |
| Protocol | Custom binary | AMQP, MQTT, STOMP |
| Best for | Event streaming, data pipelines, logging | Task queues, RPC, complex routing |
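The two storage and consumption models in the table can be contrasted with a toy in-memory sketch. Neither class talks to a real broker; `AppendOnlyLog` and `BrokerQueue` are hypothetical names, and the sketch ignores partitions, replication, and routing.

```python
from collections import deque


class AppendOnlyLog:
    """Kafka-like model: records are retained; each consumer group tracks an offset."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # consumer group -> next offset to read

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def poll(self, group, max_records=10):
        """Pull model: the consumer asks for the next batch."""
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch


class BrokerQueue:
    """RabbitMQ-like model: a message is gone once it has been acknowledged."""

    def __init__(self):
        self.ready = deque()
        self.unacked = {}  # delivery tag -> message awaiting acknowledgment
        self.next_tag = 0

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        """Hand the next message to a consumer, or None if the queue is empty."""
        if not self.ready:
            return None
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = self.ready.popleft()
        return tag, self.unacked[tag]

    def ack(self, tag):
        del self.unacked[tag]  # acknowledged messages are deleted
```

Two groups polling the same `AppendOnlyLog` each see every record, while a message delivered from `BrokerQueue` and acknowledged cannot be read again.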
Kafka Architecture
Messaging Patterns
Point-to-Point (Queue)
Definition: Point-to-Point
In point-to-point messaging, a message is sent by one producer and delivered to exactly one consumer, even when several consumers read from the same queue. The message is removed from the queue once it has been consumed. This is ideal for task distribution, where each task should be handled by a single worker.
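As a rough sketch of task distribution, the function below hands each task to one of several workers in round-robin order. The name `distribute` and the round-robin policy are assumptions for illustration; real brokers choose a consumer based on availability and prefetch settings.

```python
from collections import deque
from itertools import cycle


def distribute(tasks, worker_names):
    """Give every task to exactly one worker, taking workers in turn."""
    pending = deque(tasks)
    handled = {name: [] for name in worker_names}
    workers = cycle(worker_names)
    while pending:
        task = pending.popleft()  # once removed, no other worker can receive it
        handled[next(workers)].append(task)
    return handled


if __name__ == "__main__":
    print(distribute(["t1", "t2", "t3"], ["w1", "w2"]))
```

Adding a worker name to the list spreads the same tasks over more consumers without touching the producer side.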
Publish-Subscribe (Pub/Sub)
Definition: Publish-Subscribe
In publish-subscribe messaging, a message is published by a producer and delivered to all subscribers (consumers). Each consumer receives a copy of the message. This is ideal for broadcasting events to multiple interested parties.
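A minimal fan-out sketch, assuming an in-memory `Topic` class (a hypothetical name) where each subscriber gets its own inbox:

```python
class Topic:
    """Toy pub/sub topic: every subscriber receives its own copy of each event."""

    def __init__(self):
        self.subscribers = {}  # subscriber name -> inbox (list of events)

    def subscribe(self, name):
        self.subscribers[name] = []
        return self.subscribers[name]

    def publish(self, event):
        for inbox in self.subscribers.values():
            inbox.append(event)


if __name__ == "__main__":
    orders = Topic()
    email_inbox = orders.subscribe("email")
    audit_inbox = orders.subscribe("audit")
    orders.publish("order-created")
    print(email_inbox, audit_inbox)
```

In this sketch a subscriber only sees events published after it subscribed; whether late subscribers can catch up depends on the broker.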
Event Sourcing
Definition: Event Sourcing
Event sourcing is an architectural pattern where state changes are stored as an immutable sequence of events. Instead of storing current state, the system stores the full history of changes and derives current state by replaying events.
Event sourcing pairs naturally with Kafka. Each topic is an event log. Consumers can replay events from any offset, enabling new services to build their state from the event history without impacting existing consumers.
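To make the replay idea concrete, here is a small sketch that derives an account balance from an event list. The `Event` type, the event kinds, and the `replay` helper are all hypothetical; a plain Python list stands in for a topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: events are immutable once recorded
class Event:
    kind: str  # "deposited" or "withdrew"
    amount: int


def apply(balance: int, event: Event) -> int:
    """Fold a single event into the current state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrew":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, from_offset: int = 0, initial: int = 0) -> int:
    """Derive state by replaying events starting at the given offset."""
    balance = initial
    for event in events[from_offset:]:
        balance = apply(balance, event)
    return balance


if __name__ == "__main__":
    log = [Event("deposited", 100), Event("withdrew", 30), Event("deposited", 5)]
    print(replay(log))
```

Replaying from a later offset with a saved `initial` value is the same idea as restoring from a snapshot and then applying only the newer events.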
Message Delivery Guarantees
| Guarantee | Description | Implementation |
|---|---|---|
| At-most-once | Message may be lost, never duplicated | Fire and forget, no ack |
| At-least-once | Message may be duplicated, never lost | Ack after processing, retry |
| Exactly-once | Each message's effect is applied exactly once | Idempotent producers plus transactions, or at-least-once delivery with deduplication |
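The sketch below simulates at-least-once delivery to an idempotent consumer. It assumes every message carries a unique ID; `IdempotentConsumer` and `deliver_at_least_once` are hypothetical names, and the "lost acknowledgment" is simulated rather than produced by a real broker.

```python
class IdempotentConsumer:
    """Applies each message ID at most once, so redeliveries are harmless."""

    def __init__(self):
        self.seen_ids = set()
        self.total = 0

    def handle(self, message_id: str, amount: int) -> bool:
        if message_id in self.seen_ids:
            return False  # duplicate delivery: skip the side effect
        self.total += amount
        self.seen_ids.add(message_id)
        return True


def deliver_at_least_once(consumer, messages, redeliver_ids):
    """Deliver every message; those in redeliver_ids are delivered twice,
    as if the first acknowledgment had been lost and the broker retried."""
    for message_id, amount in messages:
        consumer.handle(message_id, amount)
        if message_id in redeliver_ids:
            consumer.handle(message_id, amount)


if __name__ == "__main__":
    c = IdempotentConsumer()
    deliver_at_least_once(c, [("a", 10), ("b", 5)], redeliver_ids={"a"})
    print(c.total)
```

In a real system the set of seen IDs would need to be stored durably, ideally in the same transaction as the side effect.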
Message Throughput
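$$\text{Throughput} = \frac{N}{T}$$
Here,
- $N$ = Total messages consumed in the time window
- $T$ = Duration of the measurement
- $B$ = Messages consumed per poll/batch, so consuming $N$ messages takes about $N / B$ polls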
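A short worked example of the quantities listed above, using made-up numbers. The function names are hypothetical and the figures are illustrative, not benchmarks.

```python
def throughput(total_messages: int, duration_seconds: float) -> float:
    """Messages consumed per second over the measurement window."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return total_messages / duration_seconds


def polls_needed(total_messages: int, batch_size: int) -> int:
    """Number of polls required if each poll returns at most batch_size messages."""
    if batch_size <= 0:
        raise ValueError("batch size must be positive")
    return -(-total_messages // batch_size)  # ceiling division


if __name__ == "__main__":
    # Assumed measurement: 120,000 messages consumed in 60 seconds, 500 per poll.
    print(throughput(120_000, 60))     # 2000.0 messages per second
    print(polls_needed(120_000, 500))  # 240 polls
```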
Dead Letter Queues
When a message cannot be processed after a configured number of retries, it is moved to a Dead Letter Queue (DLQ) for investigation.
A Dead Letter Queue prevents poison messages from blocking the main queue. Messages in the DLQ can be inspected, debugged, and reprocessed after fixing the underlying issue.
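The retry-then-dead-letter flow can be sketched as follows. This is an in-memory illustration with a hypothetical `process_with_dlq` function; real brokers configure dead-lettering through queue or consumer settings rather than application loops like this one.

```python
from collections import deque


def process_with_dlq(messages, handler, max_retries: int = 3):
    """Process messages, retrying failures; after max_retries failed attempts
    a message goes to the dead letter queue instead of back to the main queue."""
    main_queue = deque((msg, 0) for msg in messages)  # (message, attempts so far)
    processed, dead_letters = [], []
    while main_queue:
        msg, attempts = main_queue.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception as exc:
            attempts += 1
            if attempts >= max_retries:
                # Keep the error alongside the message for later investigation.
                dead_letters.append((msg, repr(exc)))
            else:
                main_queue.append((msg, attempts))  # requeue for another attempt
    return processed, dead_letters


if __name__ == "__main__":
    def handler(msg):
        if msg == "bad":
            raise ValueError("cannot parse")

    print(process_with_dlq(["a", "bad", "b"], handler))
```

Because the failing message is requeued behind the others and eventually moved aside, the healthy messages are all processed.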
Practice Exercises
- Design: Design an event-driven order processing system for an e-commerce platform. Orders flow through: payment, inventory check, shipping, and notification. Use Kafka or RabbitMQ and justify your choice.
- Trade-offs: Compare Kafka and RabbitMQ for: (a) real-time log aggregation, (b) background job processing, (c) IoT sensor data ingestion. Which would you choose for each and why?
- Architecture: Design a system that processes 1 million events per second with exactly-once delivery semantics. What components would you need? What are the failure modes?
- Analysis: Your RabbitMQ queue is accumulating messages faster than consumers can process them. What are 5 strategies to handle this situation?
Key Takeaways:
- Message queues decouple producers from consumers, enabling asynchronous, resilient communication
- Kafka excels at high-throughput event streaming with durable, replayable logs
- RabbitMQ excels at traditional message brokering with complex routing and low latency
- Choose delivery guarantees carefully: at-least-once with idempotent consumers is the pragmatic default
- Dead letter queues prevent poison messages from blocking processing
What to Learn Next
-> Microservices: Service decomposition, discovery, and API gateways.
-> CAP Theorem: Consistency models, availability, and partition tolerance.
-> Load Balancing: Algorithms, health checks, and L4 vs L7.
-> Databases: SQL vs NoSQL, indexing, replication, and sharding.
-> Caching Strategies: Redis, Memcached, cache invalidation, and write strategies.
-> API Design: REST, GraphQL, gRPC, versioning, and rate limiting.