Cloud Migration Strategies
Difficulty: Senior Level | Companies: AWS, Google, Microsoft, Netflix, Uber
The 6 Rs of Migration
Every workload should be evaluated against six migration strategies. Choose based on business value, technical complexity, and timeline.
ℹ️ Most migrations use a combination of the 6 Rs. Start with low-risk workloads to build confidence, then tackle complex systems.
Migration Strategies Overview
| Strategy | Also known as | Typical examples |
|---|---|---|
| Rehost | Lift & Shift | VMs to EC2; minimal changes |
| Replatform | Lift & Reshape | Managed databases; containers; serverless |
| Repurchase | Move to SaaS | On-prem CRM to Salesforce |
| Refactor | Re-architect | Microservices; cloud-native |
| Retire | Decommission | Unused systems; legacy apps |
| Retain | Keep on-prem | Regulatory reasons; active legacy |
Pattern 1: Migration Assessment Framework
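Score each workload to recommend a migration strategy, then adjust the result for compliance constraints. The scorer below ranks only Rehost, Replatform, and Refactor; Retain is reached through a constraint, and Repurchase and Retire are left to manual review.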
# Migration assessment scoring
from dataclasses import dataclass
from typing import List
from enum import Enum


class MigrationStrategy(Enum):
    REHOST = "rehost"
    REPLATFORM = "replatform"
    REFACTOR = "refactor"
    REPURCHASE = "repurchase"
    RETIRE = "retire"
    RETAIN = "retain"


@dataclass
class WorkloadAssessment:
    name: str
    business_criticality: int  # 1-5
    technical_complexity: int  # 1-5
    dependencies: int  # count
    data_sensitivity: str  # low, medium, high
    regulatory_requirements: List[str]
    current_cost: float
    cloud_readiness: int  # 1-5
class MigrationAssessor:
    def assess_workload(self, workload: WorkloadAssessment) -> dict:
        """Calculate migration strategy recommendation."""
        # Calculate scores for each scored strategy
        scores = {}

        # Rehost scoring (favors simple, low-criticality)
        scores[MigrationStrategy.REHOST] = (
            (6 - workload.business_criticality) * 2 +
            (6 - workload.technical_complexity) * 3 +
            (6 - min(workload.dependencies, 5)) * 2 +
            workload.cloud_readiness
        )

        # Replatform scoring (favors moderate complexity)
        scores[MigrationStrategy.REPLATFORM] = (
            workload.business_criticality * 2 +
            workload.technical_complexity * 2 +
            workload.cloud_readiness * 3
        )

        # Refactor scoring (favors high-value, complex)
        scores[MigrationStrategy.REFACTOR] = (
            workload.business_criticality * 3 +
            workload.technical_complexity * 3 +
            workload.cloud_readiness * 2
        )

        # Determine recommended strategy (highest score wins)
        recommended = max(scores, key=scores.get)

        # Adjust for constraints
        if 'PCI-DSS' in workload.regulatory_requirements:
            if recommended == MigrationStrategy.REHOST:
                recommended = MigrationStrategy.REPLATFORM
        if workload.data_sensitivity == 'high' and workload.dependencies > 10:
            recommended = MigrationStrategy.RETAIN

        return {
            'workload': workload.name,
            'recommended_strategy': recommended.value,
            'scores': {k.value: v for k, v in scores.items()},
            'estimated_timeline': self.estimate_timeline(recommended, workload),
            'estimated_cost': self.estimate_cost(recommended, workload),
        }

    def estimate_timeline(self, strategy: MigrationStrategy, workload: WorkloadAssessment) -> str:
        timelines = {
            MigrationStrategy.REHOST: '2-4 weeks',
            MigrationStrategy.REPLATFORM: '1-3 months',
            MigrationStrategy.REFACTOR: '3-12 months',
            MigrationStrategy.REPURCHASE: '1-6 months',
            MigrationStrategy.RETIRE: '1-2 weeks',
            MigrationStrategy.RETAIN: 'N/A',
        }
        return timelines[strategy]

    def estimate_cost(self, strategy: MigrationStrategy, workload: WorkloadAssessment) -> str:
        # assess_workload calls this method, but the original snippet never
        # defined it. This placeholder only reports the on-prem baseline;
        # substitute your own cost model per strategy.
        return f"not modeled (current on-prem cost: {workload.current_cost:,.2f})"
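To make the scoring arithmetic concrete, the sketch below recomputes the three weighted scores for one hypothetical workload. The weights are copied from assess_workload; the function name score_workload and the sample values (criticality 2, complexity 2, three dependencies, readiness 4) are illustrative assumptions, not part of the original framework.

```python
def score_workload(criticality: int, complexity: int,
                   dependencies: int, readiness: int) -> dict:
    """Return the weighted score per strategy, using the assess_workload weights."""
    capped_deps = min(dependencies, 5)  # the dependency count is capped at 5
    return {
        "rehost": (6 - criticality) * 2 + (6 - complexity) * 3
                  + (6 - capped_deps) * 2 + readiness,
        "replatform": criticality * 2 + complexity * 2 + readiness * 3,
        "refactor": criticality * 3 + complexity * 3 + readiness * 2,
    }


# Hypothetical internal tool: low criticality, simple, few dependencies.
scores = score_workload(criticality=2, complexity=2, dependencies=3, readiness=4)
recommended = max(scores, key=scores.get)

print(scores)       # {'rehost': 30, 'replatform': 20, 'refactor': 20}
print(recommended)  # rehost
```

Rehost wins here because its formula rewards low criticality and low complexity, while the other two formulas grow with both. A critical, complex workload flips the ranking toward Refactor.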
Pattern 2: Phased Migration Plan
Execute migration in controlled phases.
# Migration phases
phases:
  phase_1_foundation:
    name: "Foundation & Non-Critical"
    duration: "Month 1-2"
    workloads:
      - development_environments
      - internal_tools
      - test_systems
    activities:
      - Set up AWS Organizations
      - Configure networking (VPC, Direct Connect)
      - Implement IAM and security baseline
      - Deploy CI/CD pipelines
    success_criteria:
      - Network connectivity verified
      - Security controls in place
      - Dev environments operational

  phase_2_data_migrations:
    name: "Data & Databases"
    duration: "Month 2-4"
    workloads:
      - data_warehouses
      - analytics_platforms
      - reporting_systems
    activities:
      - Set up data lake on S3
      - Migrate databases using DMS
      - Implement CDC for real-time sync
      - Validate data integrity
    success_criteria:
      - Data migration complete
      - No data loss verified
      - Performance benchmarks met

  phase_3_application_migration:
    name: "Application Workloads"
    duration: "Month 3-8"
    workloads:
      - web_applications
      - api_services
      - batch_processing
    activities:
      - Containerize applications
      - Deploy to EKS/ECS
      - Migrate stateful services
      - Implement observability
    success_criteria:
      - All apps running in cloud
      - Performance equivalent or better
      - Monitoring and alerting active

  phase_4_optimization:
    name: "Optimization & Decommission"
    duration: "Month 6-12"
    workloads:
      - legacy_systems
      - remaining_on_prem
    activities:
      - Optimize cloud resources
      - Decommission on-premises
      - Cost optimization
      - Documentation
    success_criteria:
      - On-prem decommissioned
      - Cost targets achieved
      - Full cloud operation
ℹ️ Migrations typically take 12-24 months. Plan for 3-6 months of optimization after initial migration.
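The phases in the plan deliberately overlap (for example, data migration starts while the foundation work is finishing). A quick sanity check on a plan like this is to compute its total span and list which phases run concurrently, since those are the periods where teams compete for the same people. The sketch below does that for the durations copied from the plan; the PHASES dict layout and the function names are assumptions made for illustration.

```python
import re

# Durations copied from the phased plan; the flat dict layout is an assumption.
PHASES = {
    "phase_1_foundation": "Month 1-2",
    "phase_2_data_migrations": "Month 2-4",
    "phase_3_application_migration": "Month 3-8",
    "phase_4_optimization": "Month 6-12",
}


def parse_duration(text: str) -> tuple[int, int]:
    """Parse a string such as 'Month 2-4' into (2, 4)."""
    match = re.fullmatch(r"Month (\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError(f"duration ends before it starts: {text!r}")
    return start, end


def overlapping_phases(phases: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of phases whose month ranges share at least one month."""
    spans = {name: parse_duration(duration) for name, duration in phases.items()}
    names = list(spans)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            (start_1, end_1), (start_2, end_2) = spans[first], spans[second]
            if start_1 <= end_2 and start_2 <= end_1:
                pairs.append((first, second))
    return pairs


total_months = max(parse_duration(duration)[1] for duration in PHASES.values())
print(total_months)  # 12
for first, second in overlapping_phases(PHASES):
    print(f"{first} overlaps {second}")
```

For this plan the check reports three overlaps, each between consecutive phases, and a 12-month span.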
Pattern 3: Database Migration with DMS
Migrate databases with minimal downtime.
# AWS DMS migration setup
import boto3

dms = boto3.client('dms')

# Create replication instance
instance = dms.create_replication_instance(
    ReplicationInstanceIdentifier='migration-instance',
    ReplicationInstanceClass='dms.r5.large',
    AllocatedStorage=100,
    MultiAZ=True,
    EngineVersion='3.5.1',
)
instance_arn = instance['ReplicationInstance']['ReplicationInstanceArn']

# Wait until the instance is available before creating the task
dms.get_waiter('replication_instance_available').wait(
    Filters=[{'Name': 'replication-instance-arn', 'Values': [instance_arn]}]
)

# Create source endpoint (on-premises)
source = dms.create_endpoint(
    EndpointIdentifier='onprem-postgres',
    EndpointType='source',
    EngineName='postgres',
    ServerName='onprem-db.example.com',
    Port=5432,
    DatabaseName='production',
    Username='dms_user',
    Password='****',  # placeholder; load the real value from a secrets store
    SslMode='require',
)

# Create target endpoint (AWS)
target = dms.create_endpoint(
    EndpointIdentifier='aws-rds-postgres',
    EndpointType='target',
    EngineName='postgres',
    ServerName='prod-db.cluster-xxx.us-east-1.rds.amazonaws.com',
    Port=5432,
    DatabaseName='production',
    Username='dms_user',
    Password='****',  # placeholder; load the real value from a secrets store
)

# Create migration task. The ARNs come from the responses above because
# DMS generates them; they cannot be derived from the identifiers.
dms.create_replication_task(
    ReplicationTaskIdentifier='full-load-cdc',
    SourceEndpointArn=source['Endpoint']['EndpointArn'],
    TargetEndpointArn=target['Endpoint']['EndpointArn'],
    ReplicationInstanceArn=instance_arn,
    MigrationType='full-load-and-cdc',
    TableMappings="""{
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "all-tables",
                "object-locator": {
                    "schema-name": "public",
                    "table-name": "%"
                },
                "rule-action": "include"
            }
        ]
    }""",
    ReplicationTaskSettings="""{
        "TargetMetadata": {
            "TargetSchema": "",
            "SupportLobs": true,
            "FullLobMode": false,
            "LobChunkSize": 64
        },
        "FullLoadSettings": {
            "TargetTablePrepMode": "DROP_AND_CREATE",
            "CreatePkAfterFullLoad": true
        },
        "Logging": {
            "EnableLogging": true,
            "LogComponents": [
                {"Id": "TRANSFORMATION", "Severity": "LOGGER_SEVERITY_DEFAULT"},
                {"Id": "SOURCE_UNLOAD", "Severity": "LOGGER_SEVERITY_DEFAULT"},
                {"Id": "TARGET_LOAD", "Severity": "LOGGER_SEVERITY_DEFAULT"}
            ]
        }
    }""",
)
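The phase plan lists "Validate data integrity" and "No data loss verified" as requirements, but the DMS setup does not show how to check them. One simple approach, sketched below, compares a row count and a content checksum per table between source and target. Two in-memory SQLite databases stand in for the on-prem and RDS PostgreSQL databases so the example runs anywhere; table_fingerprint, compare_tables, and the orders table are hypothetical names, and a real check would connect to both databases with a PostgreSQL driver and probably checksum in chunks.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str) -> tuple[int, str]:
    """Return (row_count, SHA-256 of all rows read in key order)."""
    digest = hashlib.sha256()
    count = 0
    # Identifiers cannot be bound as SQL parameters, so pass only trusted names.
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"'):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def compare_tables(source: sqlite3.Connection, target: sqlite3.Connection,
                   tables: dict[str, str]) -> dict[str, bool]:
    """Map each table name to True when source and target fingerprints match."""
    return {
        table: table_fingerprint(source, table, key) == table_fingerprint(target, table, key)
        for table, key in tables.items()
    }


def make_db(rows: list[tuple[int, float]]) -> sqlite3.Connection:
    """Build a stand-in database with a single hypothetical 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


source = make_db([(1, 9.5), (2, 20.0), (3, 7.25)])
good_target = make_db([(3, 7.25), (1, 9.5), (2, 20.0)])  # same rows, other insert order
bad_target = make_db([(1, 9.5), (2, 20.0)])              # one row missing

print(compare_tables(source, good_target, {"orders": "id"}))  # {'orders': True}
print(compare_tables(source, bad_target, {"orders": "id"}))   # {'orders': False}
```

Ordering by the key column makes the checksum independent of insertion order, which matters because a full load rarely writes rows in the same physical order as the source.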
Pattern 4: Application Containerization
Containerize legacy applications for cloud deployment.
# Multi-stage Dockerfile for legacy Java app
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests
FROM eclipse-temurin:17-jre-jammy
WORKDIR /app
# Copy application
COPY --from=build /app/target/*.jar app.jar
# Copy configuration
COPY config/ /app/config/
# Health check
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1
# Run application
ENTRYPOINT ["java", "-jar", "app.jar"]

# Kubernetes deployment for containerized app
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: legacy-app
  template:
    metadata:
      labels:
        app: legacy-app
    spec:
      containers:
        - name: app
          image: 123456789.dkr.ecr.us-east-1.amazonaws.com/legacy-app:latest
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
⚠️ Containerization requires understanding application dependencies. Test thoroughly in staging before production deployment.
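One small piece of that staging test can be automated: a smoke check that polls the same /health path the Dockerfile HEALTHCHECK uses and fails the pipeline if the service never becomes healthy. The sketch below uses only the Python standard library. The function name wait_for_healthy is hypothetical, and its defaults simply mirror the HEALTHCHECK values (3 retries, 30 s interval, 10 s timeout).

```python
import time
import urllib.error
import urllib.request


def wait_for_healthy(url: str, retries: int = 3, interval: float = 30.0,
                     timeout: float = 10.0) -> bool:
    """Return True as soon as `url` answers HTTP 200, else False after `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or an HTTP error status (4xx/5xx).
            pass
        if attempt < retries:
            time.sleep(interval)  # wait before the next attempt
    return False
```

A staging job might call `wait_for_healthy("http://<staging-host>:8080/health")` after each deployment and abort the rollout when it returns False; the host name is left as a placeholder because it depends on your environment.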
Pattern 5: Post-Migration Optimization
Optimize cloud resources after migration.
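# Post-migration optimization checklist
from datetime import datetime, timedelta, timezone

import boto3
from botocore.exceptions import ClientError


class PostMigrationOptimizer:
    def get_running_instances(self):
        """List running EC2 instances in the current region."""
        paginator = boto3.client('ec2').get_paginator('describe_instances')
        pages = paginator.paginate(
            Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
        )
        return [
            instance
            for page in pages
            for reservation in page['Reservations']
            for instance in reservation['Instances']
        ]

    def get_metric_average(self, namespace, metric, instance_id, days):
        """Average hourly datapoints for one instance; None if there are none."""
        cloudwatch = boto3.client('cloudwatch')
        end = datetime.now(timezone.utc)
        stats = cloudwatch.get_metric_statistics(
            Namespace=namespace,
            MetricName=metric,
            Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
            StartTime=end - timedelta(days=days),
            EndTime=end,
            Period=3600,
            Statistics=['Average'],
        )
        points = [point['Average'] for point in stats['Datapoints']]
        return sum(points) / len(points) if points else None

    def optimize_compute(self):
        """Right-size compute resources after migration."""
        recommendations = []
        # Analyze EC2 utilization
        for instance in self.get_running_instances():
            cpu_avg = self.get_metric_average(
                'AWS/EC2', 'CPUUtilization',
                instance['InstanceId'], days=14
            )
            # Skip instances that have no datapoints yet
            if cpu_avg is not None and cpu_avg < 20:
                recommendations.append({
                    'instance': instance['InstanceId'],
                    'current_type': instance['InstanceType'],
                    'cpu_utilization': cpu_avg,
                    'action': 'Downsize or convert to Spot',
                })
        return recommendations

    def optimize_storage(self):
        """Optimize storage costs."""
        s3 = boto3.client('s3')
        # Report buckets with no lifecycle transition to Intelligent-Tiering
        for bucket in s3.list_buckets()['Buckets']:
            try:
                rules = s3.get_bucket_lifecycle_configuration(
                    Bucket=bucket['Name']
                ).get('Rules', [])
            except ClientError as err:
                # Raised when the bucket has no lifecycle configuration
                if err.response['Error']['Code'] != 'NoSuchLifecycleConfiguration':
                    raise
                rules = []
            tiered = any(
                transition.get('StorageClass') == 'INTELLIGENT_TIERING'
                for rule in rules
                for transition in rule.get('Transitions', [])
            )
            if not tiered:
                print(f"Enable Intelligent Tiering for {bucket['Name']}")

    def implement_auto_scaling(self):
        """Add auto-scaling for variable workloads."""
        # Configure application auto scaling
        aas = boto3.client('application-autoscaling')
        aas.register_scalable_target(
            ServiceNamespace='ecs',
            ResourceId='service/cluster/service-name',
            ScalableDimension='ecs:service:DesiredCount',
            MinCapacity=2,
            MaxCapacity=20,
        )
        # Target tracking scaling
        aas.put_scaling_policy(
            PolicyName='cpu-tracking',
            ServiceNamespace='ecs',
            ResourceId='service/cluster/service-name',
            ScalableDimension='ecs:service:DesiredCount',
            PolicyType='TargetTrackingScaling',
            TargetTrackingScalingPolicyConfiguration={
                'TargetValue': 70.0,
                'PredefinedMetricSpecification': {
                    'PredefinedMetricType': 'ECSServiceAverageCPUUtilization',
                },
                'ScaleInCooldown': 300,
                'ScaleOutCooldown': 60,
            },
        )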
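The right-sizing rule in optimize_compute (flag instances whose average CPU stays under 20%) can be tried offline before pointing it at a real account. The sketch below applies the same threshold to in-memory samples; flag_underutilized, the instance IDs, and the sample values are all hypothetical.

```python
from statistics import mean


def flag_underutilized(cpu_samples: dict[str, list[float]],
                       threshold: float = 20.0) -> list[dict]:
    """Return a recommendation for each instance whose mean CPU is below `threshold`."""
    recommendations = []
    for instance_id, samples in cpu_samples.items():
        if not samples:
            continue  # no datapoints: utilization cannot be judged
        cpu_avg = mean(samples)
        if cpu_avg < threshold:
            recommendations.append({
                "instance": instance_id,
                "cpu_utilization": round(cpu_avg, 1),
                "action": "Downsize or convert to Spot",
            })
    return recommendations


# Hypothetical CPU averages (percent) for three instances.
samples = {
    "i-aaa": [12.0, 9.5, 14.5],   # mean 12.0, so it is flagged
    "i-bbb": [55.0, 61.0, 48.0],  # well above the threshold
    "i-ccc": [],                  # no data, so it is skipped
}
print(flag_underutilized(samples))
# [{'instance': 'i-aaa', 'cpu_utilization': 12.0, 'action': 'Downsize or convert to Spot'}]
```

Average CPU alone can mislead for bursty or memory-bound workloads, so treat the output as a list of candidates to review rather than an automatic action.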
Migration Success Metrics
| Metric | Target | Measurement |
|---|---|---|
| Migration Completion | 100% | Workloads migrated |
| Application Availability | 99.9% | Uptime during migration |
| Performance Degradation | <5% | Response time comparison |
| Data Integrity | 100% | No data loss |
| Cost vs On-Prem | ≤120% | Monthly cloud spend |
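These targets are easy to encode as an automated gate at the end of each phase. The sketch below checks measured values against the thresholds from the table, reading the cost target as "at most 120% of the on-prem baseline". The metric keys, the TARGETS structure, and the measured numbers are hypothetical.

```python
import operator

# (metric, comparison, target) taken from the table; all values are percentages.
TARGETS = [
    ("migration_completion", operator.ge, 100.0),
    ("application_availability", operator.ge, 99.9),
    ("performance_degradation", operator.lt, 5.0),
    ("data_integrity", operator.ge, 100.0),
    ("cost_vs_on_prem", operator.le, 120.0),
]


def evaluate(measured: dict[str, float]) -> dict[str, bool]:
    """Map each metric name to True when the measured value meets its target."""
    return {
        name: compare(measured[name], target)
        for name, compare, target in TARGETS
    }


# Hypothetical measurements collected after a migration wave.
measured = {
    "migration_completion": 100.0,
    "application_availability": 99.95,
    "performance_degradation": 6.2,
    "data_integrity": 100.0,
    "cost_vs_on_prem": 112.0,
}

results = evaluate(measured)
failed = [name for name, ok in results.items() if not ok]
print(failed)  # ['performance_degradation']
```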
Migration Checklist
- Assessment - Evaluate all workloads with 6 Rs
- Foundation - Set up landing zone, security, networking
- Pilot - Migrate non-critical workloads first
- Execute - Follow phased migration plan
- Validate - Test functionality and performance
- Optimize - Right-size and implement cost controls
- Decommission - Remove on-premises infrastructure
Follow-Up Questions
- How do you handle data residency requirements when migrating to a multi-region cloud architecture?
- What strategies would you use to migrate a monolithic application to microservices during cloud migration?
- How do you measure migration success and justify cloud investment to stakeholders?