Serverless Architecture Patterns
Difficulty: Senior Level | Companies: AWS, Google, Microsoft, Netflix, Uber
Serverless Mental Model
Serverless isn't "no servers"; it's about abstracting server management away. You pay per invocation, scale automatically, and focus on business logic.
Note: Serverless is ideal for event-driven, spiky, or asynchronous workloads. It's less suitable for long-running, stateful, or steady-state workloads.
Pattern 1: Fan-Out/Fan-In with Step Functions
Process items in parallel and aggregate results.
// Step Functions state machine definition (Amazon States Language as a JS object)
const parallelProcessingStateMachine = {
  Comment: 'Fan-out/fan-in for parallel data processing',
  StartAt: 'ReceiveBatch',
  States: {
    ReceiveBatch: {
      Type: 'Task',
      Resource: 'arn:aws:lambda:us-east-1:123456789:function:receive-batch',
      Next: 'DistributeWork',
    },
    // Fan-out: run ProcessItem once per element of $.items
    DistributeWork: {
      Type: 'Map',
      ItemsPath: '$.items',
      MaxConcurrency: 100,
      Iterator: {
        StartAt: 'ProcessItem',
        States: {
          ProcessItem: {
            Type: 'Task',
            Resource: 'arn:aws:lambda:us-east-1:123456789:function:process-item',
            // Retry failed tasks up to 3 times, waiting 2s, 4s, then 8s
            Retry: [
              {
                ErrorEquals: ['States.TaskFailed'],
                IntervalSeconds: 2,
                MaxAttempts: 3,
                BackoffRate: 2,
              },
            ],
            End: true,
          },
        },
      },
      Next: 'AggregateResults',
    },
    // Fan-in: receives the array of per-item results from the Map state
    AggregateResults: {
      Type: 'Task',
      Resource: 'arn:aws:lambda:us-east-1:123456789:function:aggregate',
      Next: 'SendNotification',
    },
    SendNotification: {
      Type: 'Task',
      // SNS is invoked through the service integration; the topic ARN is a parameter, not the Resource
      Resource: 'arn:aws:states:::sns:publish',
      Parameters: {
        TopicArn: 'arn:aws:sns:us-east-1:123456789:processing-complete',
        Message: 'Batch processing complete', // placeholder message text
      },
      End: true,
    },
  },
};
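The fan-in step receives the Map state's output, which is an array holding one result per processed item. A minimal sketch of what the `aggregate` function could look like, assuming each `process-item` invocation returns a dict with hypothetical `status` and `amount` fields (neither is specified in the original):

```python
from collections import Counter


def aggregate_handler(event, context=None):
    """Fan-in step: summarize the per-item results produced by the Map state.

    `event` is assumed to be the Map state's output: a list with one dict per
    processed item. The `status` and `amount` keys are hypothetical.
    """
    results = event if isinstance(event, list) else event.get("results", [])
    status_counts = Counter(r.get("status", "unknown") for r in results)
    # Only successfully processed items contribute to the total.
    total_amount = sum(r.get("amount", 0) for r in results if r.get("status") == "ok")
    return {
        "itemCount": len(results),
        "statusCounts": dict(status_counts),
        "totalAmount": total_amount,
    }


if __name__ == "__main__":
    sample = [
        {"status": "ok", "amount": 10},
        {"status": "ok", "amount": 5},
        {"status": "failed", "amount": 7},
    ]
    print(aggregate_handler(sample))
```

Keeping the aggregation a pure function of the Map output makes it easy to test locally without deploying the state machine.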
Pattern 2: Lambda with Provisioned Concurrency
Avoid cold starts for latency-sensitive applications.
# SAM template for provisioned concurrency
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs20.x
      CodeUri: src/
      MemorySize: 1024
      Timeout: 10
      # Provisioned concurrency attaches to the alias created by AutoPublishAlias
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 10
      AutoPublishAlias: live
      DeploymentPreference:
        Type: Canary10Percent5Minutes
        Alarms:
          # FunctionErrorAlarm is assumed to be defined elsewhere in this template
          - !Ref FunctionErrorAlarm
      Environment:
        Variables:
          NODE_ENV: production
      Events:
        Api:
          Type: Api
          Properties:
            Path: /{proxy+}
            Method: ANY
  # Auto-scaling for provisioned concurrency
  ScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    # Wait for the alias that SAM generates from AutoPublishAlias
    DependsOn: ApiFunctionAliaslive
    Properties:
      MaxCapacity: 100
      MinCapacity: 10
      ResourceId: !Sub function:${ApiFunction}:live
      ScalableDimension: lambda:function:ProvisionedConcurrency
      ServiceNamespace: lambda
      ScheduledActions:
        - ScheduledActionName: ScaleUpMorning
          Schedule: "cron(0 8 ? * MON-FRI *)"
          ScalableTargetAction:
            MinCapacity: 50
        - ScheduledActionName: ScaleDownEvening
          Schedule: "cron(0 20 ? * MON-FRI *)"
          ScalableTargetAction:
            MinCapacity: 10
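Warning: Provisioned concurrency costs roughly 40% more than on-demand, and it is billed for as long as it is configured, whether or not it serves traffic. Use it only for latency-critical paths, and scale it automatically on a schedule or on utilization metrics.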
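To see what the two scheduled actions imply, the sketch below computes the minimum provisioned concurrency in effect at a given moment. It assumes the cron expressions are evaluated in UTC (the default when no timezone is configured) and that no other scaling policy intervenes; `scheduled_min_capacity` is a hypothetical helper, not an AWS API.

```python
from datetime import datetime, timezone

BASELINE_MIN = 10  # MinCapacity applied by ScaleDownEvening (20:00, Mon-Fri)
BUSINESS_MIN = 50  # MinCapacity applied by ScaleUpMorning (08:00, Mon-Fri)


def scheduled_min_capacity(moment: datetime) -> int:
    """Return the MinCapacity the schedule implies at `moment` (timezone-aware)."""
    utc = moment.astimezone(timezone.utc)
    is_weekday = utc.weekday() < 5           # Monday=0 ... Friday=4
    in_business_hours = 8 <= utc.hour < 20   # scaled up at 08:00, down at 20:00
    return BUSINESS_MIN if is_weekday and in_business_hours else BASELINE_MIN


if __name__ == "__main__":
    monday_morning = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    saturday_noon = datetime(2024, 1, 6, 12, 0, tzinfo=timezone.utc)
    print(scheduled_min_capacity(monday_morning), scheduled_min_capacity(saturday_noon))
```

Because Friday's evening action is the last one to run before the weekend, the floor stays at the baseline until Monday morning.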
Pattern 3: Lambda Layers for Shared Dependencies
Share common code across multiple Lambda functions.
# requirements-layer/
# requirements.txt
boto3==1.28.0
requests==2.31.0
pydantic==2.0.0
# Build the layer
# pip install -r requirements.txt -t python/lib/python3.11/site-packages/
# zip -r layer.zip python/
# Lambda function using the layer
# Layer contents are extracted under /opt, and the Python runtime already puts
# /opt/python and /opt/python/lib/python3.11/site-packages on sys.path,
# so no manual sys.path changes are needed.
import math
from typing import List

import boto3
import requests
from pydantic import BaseModel


class OrderItem(BaseModel):
    product_id: str
    quantity: int
    price: float


class OrderRequest(BaseModel):
    customer_id: str
    items: List[OrderItem]
    total: float


def handler(event, context):
    # Validate the payload shape and types with Pydantic
    order = OrderRequest(**event)
    # Check the total with a tolerance; exact float comparison is unreliable
    expected_total = sum(i.quantity * i.price for i in order.items)
    if not math.isclose(order.total, expected_total, abs_tol=0.005):
        raise ValueError("Total mismatch")
    # External API call
    response = requests.post(
        'https://api.example.com/orders',
        json=order.model_dump(),  # Pydantic v2 replacement for the deprecated .dict()
        timeout=5
    )
    return {
        'statusCode': response.status_code,
        'body': response.json()
    }
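Validating monetary totals with binary floats is fragile, because rounding can make a correct order look wrong (for example, `0.1 * 3` is not exactly `0.3`). A hedged alternative, using only the standard library and hypothetical `to_cents` and `totals_match` helpers, converts values to `Decimal` and rounds to cents before comparing:

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(value) -> Decimal:
    # Go through str() so 0.1 becomes Decimal('0.1') rather than its binary approximation.
    return Decimal(str(value)).quantize(CENT, rounding=ROUND_HALF_UP)


def totals_match(items, declared_total) -> bool:
    """Compare the declared total with the sum of price * quantity, in whole cents."""
    computed = sum(
        (to_cents(item["price"]) * item["quantity"] for item in items),
        Decimal("0"),
    )
    return to_cents(computed) == to_cents(declared_total)


if __name__ == "__main__":
    items = [{"price": 0.1, "quantity": 3}]
    print(0.1 * 3 == 0.3)             # naive float comparison
    print(totals_match(items, 0.3))   # Decimal-based comparison
```

Storing amounts as integer cents end to end would avoid the conversion entirely; the sketch only shows how to compare safely when floats arrive in the payload.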
Pattern 4: Async Processing with SQS + Lambda
Decouple producers from consumers with SQS.
// SQS queue configuration with DLQ
const sqsConfig = {
  QueueName: 'order-processing-queue',
  DelaySeconds: 0,
  MessageRetentionPeriod: 1209600, // 14 days
  VisibilityTimeout: 300, // 5 minutes; should exceed the Lambda function timeout
  RedrivePolicy: {
    deadLetterTargetArn: 'arn:aws:sqs:us-east-1:123456789:dlq-order-processing',
    maxReceiveCount: 3, // move a message to the DLQ after 3 failed receives
  },
  Tags: [
    { Key: 'Environment', Value: 'production' },
    { Key: 'Service', Value: 'order-processing' },
  ],
};

// Lambda handler with batch processing
exports.handler = async (event) => {
  const batchItemFailures = [];
  for (const record of event.Records) {
    try {
      const order = JSON.parse(record.body);
      await processOrder(order);
    } catch (error) {
      console.error(`Failed to process ${record.messageId}:`, error);
      batchItemFailures.push({
        itemIdentifier: record.messageId,
      });
    }
  }
  // Partial batch response: only the listed messages are retried.
  // Requires ReportBatchItemFailures on the event source mapping.
  return { batchItemFailures };
};

// `db` is assumed to be a database client initialized elsewhere in the module
async function processOrder(order) {
  // Mark the order as processed
  await db.orders.update({
    where: { id: order.id },
    data: { status: 'processed', processedAt: new Date() },
  });
}
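The same partial-batch idea, sketched in Python so it can be exercised locally. The `process_order` function here is a stand-in that rejects orders without an `id` (an assumption for illustration); the response shape with `batchItemFailures` and `itemIdentifier` matches the handler above.

```python
import json


def process_order(order: dict) -> None:
    # Hypothetical stand-in for real processing: reject orders without an id.
    if "id" not in order:
        raise ValueError("order is missing 'id'")


def handler(event, context=None):
    """Process each SQS record and report only the failed ones for retry."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_order(json.loads(record["body"]))
        except Exception as exc:  # any failure marks only this message for retry
            print(f"Failed to process {record['messageId']}: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


if __name__ == "__main__":
    event = {
        "Records": [
            {"messageId": "m1", "body": json.dumps({"id": 1})},
            {"messageId": "m2", "body": "not json"},
            {"messageId": "m3", "body": json.dumps({"customer": "c-9"})},
        ]
    }
    print(handler(event))
```

Messages that succeed are deleted from the queue, while the ones listed in `batchItemFailures` become visible again and count toward `maxReceiveCount`.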
Pattern 5: Lambda Cold Start Optimization
Reduce cold start times with these techniques.
# Optimized Lambda for fast cold starts
import json

import boto3
import requests

# 1. Global scope initialization: runs once per cold start, reused on warm invocations
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('orders')

# 2. Connection reuse
session = requests.Session()  # Reuse connection pool for outbound HTTP calls


def handler(event, context):
    # 3. Lazy import for rarely-used modules
    if event.get('needs_ml'):
        import ml_model  # Only import when needed
        result = ml_model.predict(event['data'])

    # 4. Keep payload small: fetch only the attributes the response needs.
    # "status" and "total" are DynamoDB reserved words, so they must be aliased.
    order_id = event['pathParameters']['id']
    response = table.get_item(
        Key={'id': order_id},
        ProjectionExpression='id, #status, #total',
        ExpressionAttributeNames={'#status': 'status', '#total': 'total'},
    )
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json',
            'Cache-Control': 'public, max-age=60',
        },
        # default=str serializes the Decimal values boto3 returns for numbers
        'body': json.dumps(response.get('Item', {}), default=str),
    }
Note: Cold starts typically add 100 ms to 2 s of latency, depending on the runtime. Java and .NET tend to have the longest cold starts; Python and Node.js tend to be the fastest.
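Module-scope state survives between invocations in the same execution environment, which is why setup placed there is paid only on a cold start. A small sketch that makes this visible, using `functools.lru_cache` to build an expensive object lazily and only once; `ExpensiveClient` and `STATS` are hypothetical stand-ins for an SDK client or model and are not part of any AWS library.

```python
import functools
import time

STATS = {"inits": 0}  # counts how many times the expensive setup actually ran


class ExpensiveClient:
    """Hypothetical stand-in for a client or model that is slow to construct."""

    def __init__(self):
        STATS["inits"] += 1
        time.sleep(0.05)  # simulate slow setup work

    def lookup(self, key):
        return {"id": key}


@functools.lru_cache(maxsize=None)
def get_client() -> ExpensiveClient:
    # First call constructs the client; later calls return the cached instance.
    return ExpensiveClient()


def handler(event, context=None):
    return get_client().lookup(event["id"])


if __name__ == "__main__":
    for key in ("a", "b", "c"):
        handler({"id": key})
    print(STATS["inits"])
```

Lazy construction like this combines both ideas from the list above: the cost is deferred until a code path needs the client, and then amortized across warm invocations.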
Cost Comparison
| Pattern | Monthly Cost (1M requests) | Cold Start | Best For |
|---|---|---|---|
| On-demand Lambda | $0.20 + compute | 100-500ms | Spiky workloads |
| Provisioned Concurrency | $15 provisioned | <10ms | Latency-sensitive |
| Lambda + SQS | $0.40/1M msgs | 100-500ms | Async processing |
| Step Functions | $0.025/1K transitions | N/A | Complex workflows |
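The request-based rates in the table can be turned into a rough monthly estimate. The sketch below uses only the rates quoted above (0.20 per 1M Lambda requests, 0.40 per 1M SQS messages, 0.025 per 1K Step Functions transitions, all in USD) and deliberately leaves out compute duration, since the table gives no rate for it. The function names are hypothetical helpers.

```python
def lambda_request_cost(requests: int, rate_per_million: float = 0.20) -> float:
    """Request charge only; compute (duration x memory) is billed separately."""
    return requests / 1_000_000 * rate_per_million


def sqs_message_cost(messages: int, rate_per_million: float = 0.40) -> float:
    return messages / 1_000_000 * rate_per_million


def step_functions_cost(transitions: int, rate_per_thousand: float = 0.025) -> float:
    return transitions / 1_000 * rate_per_thousand


if __name__ == "__main__":
    monthly_volume = 1_000_000
    print(f"Lambda requests:  ${lambda_request_cost(monthly_volume):.2f}")
    print(f"SQS messages:     ${sqs_message_cost(monthly_volume):.2f}")
    print(f"State transitions: ${step_functions_cost(monthly_volume):.2f}")
```

At equal volume the per-transition rate dominates, so workflows with many small states are where Step Functions cost deserves the closest look.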
Follow-Up Questions
- How do you handle state management in a serverless workflow that requires human approval steps?
- What are the trade-offs between Lambda, Fargate, and EC2 for a batch processing workload?
- How would you implement circuit breaker patterns in a serverless microservices architecture?