
Data Migration on AWS


Master DMS, Snowball, Transfer Family, and migration strategies.

Module: AWS Data Engineering • Topic 33 of 65 • Premium Content

Migration Strategies

Architecture Diagram
┌──────────────────────────────────────────────────────────────────┐
│                    DATA MIGRATION STRATEGIES                     │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  1. DMS (Database Migration Service)                       │  │
│  │     Small to medium databases (roughly <10 TB)             │  │
│  │     Continuous replication (CDC)                           │  │
│  │     Source: RDS, MySQL, PostgreSQL, Oracle                 │  │
│  │     Target: RDS, Redshift, S3, Aurora                      │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  2. SNOWBALL (Physical Device)                             │  │
│  │     Large datasets (roughly >10 TB)                        │  │
│  │     Transfers limited by network bandwidth                 │  │
│  │     ~80 TB usable per device (Snowball, Snowball Edge)     │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  3. TRANSFER FAMILY (SFTP/FTPS/FTP)                        │  │
│  │     Replace on-premises SFTP servers                       │  │
│  │     Managed file transfer                                  │  │
│  │     Integration with S3                                    │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  4. DATASYNC (Large-scale file sync)                       │  │
│  │     On-premises to S3                                      │  │
│  │     S3 to S3 cross-region                                  │  │
│  │     Automatic encryption and compression                   │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘

DMS Configuration

import boto3

dms = boto3.client('dms')

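# Create replication instance
response = dms.create_replication_instance(
    ReplicationInstanceIdentifier='migration-instance',
    ReplicationInstanceClass='dms.r5.xlarge',
    AllocatedStorage=100,  # GB
    MultiAZ=True,
    EngineVersion='3.5.1'
)
instance_arn = response['ReplicationInstance']['ReplicationInstanceArn']

# Provisioning takes several minutes; wait for the instance before creating a task on it
dms.get_waiter('replication_instance_available').wait(
    Filters=[{'Name': 'replication-instance-arn', 'Values': [instance_arn]}]
)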

# Create source endpoint (on-premises Oracle)
source = dms.create_endpoint(
    EndpointIdentifier='onprem-oracle',
    EndpointType='source',
    EngineName='oracle',
    ServerName='onprem-db.company.com',
    Port=1521,
    DatabaseName='PROD',
    Username='dms_user',
    Password='SecurePassword',  # placeholder; load real credentials from a secret store
    SslMode='require'  # confirm which SSL modes DMS supports for your source engine
)

# Create target endpoint (Aurora PostgreSQL)
target = dms.create_endpoint(
    EndpointIdentifier='aurora-postgres',
    EndpointType='target',
    EngineName='aurora-postgresql',
    ServerName='aurora.cluster-123.us-east-1.rds.amazonaws.com',
    Port=5432,
    DatabaseName='production',
    Username='dms_user',
    Password='SecurePassword'  # placeholder; load real credentials from a secret store
)

# Create full load + CDC task (creation does not start it; call start_replication_task once the task is ready)
task = dms.create_replication_task(
    ReplicationTaskIdentifier='oracle-to-aurora',
    SourceEndpointArn=source['Endpoint']['EndpointArn'],
    TargetEndpointArn=target['Endpoint']['EndpointArn'],
    ReplicationInstanceArn=response['ReplicationInstance']['ReplicationInstanceArn'],
    MigrationType='full-load-and-cdc',
    TableMappings='''{
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {
                "schema-name": "PROD",
                "table-name": "%"
            },
            "rule-action": "include"
        }]
    }'''
)
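
The TableMappings argument is a JSON string, and hand-written JSON inside a Python string is easy to break once more rules are added. One possible approach, sketched below, is to build the rules as dictionaries and serialize them with json.dumps. The helper names selection_rule and build_table_mappings and the TMP_% table pattern are illustrative assumptions, not part of the DMS API.

```python
import json


def selection_rule(rule_id, schema, table="%", action="include"):
    """Return one DMS selection rule as a plain dict."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"{action}-{schema}-{rule_id}",  # rule names must be unique
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


def build_table_mappings(rules):
    """Serialize a list of rule dicts into the JSON string passed as TableMappings."""
    return json.dumps({"rules": rules}, indent=2)


mappings = build_table_mappings([
    selection_rule(1, "PROD"),                      # include every table in PROD
    selection_rule(2, "PROD", "TMP_%", "exclude"),  # hypothetical scratch tables
])
print(mappings)
```

The resulting string can be passed as TableMappings=mappings in create_replication_task.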

Snowball Usage

import boto3

snowball = boto3.client('snowball')

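# Register the shipping address first; create_job takes an AddressId, not an inline address
address = snowball.create_address(
    Address={
        'Name': 'Data Center',
        'Street1': '123 Tech St',
        'City': 'Seattle',
        'StateOrProvince': 'WA',
        'PostalCode': '98101',
        'Country': 'US',
        'PhoneNumber': '555-0100'  # placeholder
    }
)

# Create Snowball import job
response = snowball.create_job(
    JobType='IMPORT',
    Resources={
        'LambdaResources': [],
        'S3Resources': [
            {
                'BucketArn': 'arn:aws:s3:::migration-bucket',
                'KeyRange': {
                    'BeginMarker': 'data/',
                    'EndMarker': 'data/z'
                }
            }
        ]
    },
    AddressId=address['AddressId'],
    RoleARN='arn:aws:iam::123456789012:role/snowball-import-role',  # placeholder IAM role
    SnowballType='EDGE',  # confirm the device types offered in your Region
    ShippingOption='NEXT_DAY',
    Notification={
        'SnsTopicARN': 'arn:aws:sns:us-east-1:123456789012:migration-alerts',
        'JobStatesToNotify': ['Complete', 'Cancelled']
    }
)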

Interview Q&A

Q1: When to use DMS vs Snowball?

Answer: Use DMS for database migrations (as a rule of thumb, under roughly 10 TB) that need ongoing replication. Use Snowball for very large datasets (above roughly 10 TB) where transferring over the network would take too long.
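
Whether network transfer is impractical comes down to arithmetic on data size and usable bandwidth. The sketch below estimates transfer time under assumed conditions: decimal terabytes, a steady link, and 80% utilization. The function name transfer_days and the default utilization are illustrative choices.

```python
def transfer_days(data_tb, link_mbps, utilization=0.8):
    """Estimate the days needed to move data_tb terabytes over a network link."""
    if link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link_mbps must be positive and utilization in (0, 1]")
    bits = data_tb * 1e12 * 8                       # decimal TB to bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400                         # seconds per day


for size_tb in (1, 10, 100):
    days = transfer_days(size_tb, link_mbps=1000)
    print(f"{size_tb:>4} TB over 1 Gbps: {days:.1f} days")
```

Under these assumptions, 10 TB over a 1 Gbps link takes a little over a day, while 100 TB takes about 11.6 days, which is the range where shipping a device starts to compete with the network.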

Q2: What is DMS CDC?

Answer: Change Data Capture (CDC) reads ongoing changes from the source database's transaction log (for example, the MySQL binlog or PostgreSQL WAL) and applies them to the target, enabling continuous replication.
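
The idea can be illustrated with a toy model: after the full load, ordered change events are replayed against the target, and a checkpoint records how far replication has progressed. This is a conceptual sketch only, not how DMS is implemented; the event fields (lsn, op, key, row) and the function apply_changes are assumptions made for the example.

```python
def apply_changes(target, change_log, last_applied=0):
    """Replay change events newer than last_applied onto target, in log order.

    Returns the new checkpoint (the highest log position applied).
    """
    for event in sorted(change_log, key=lambda e: e["lsn"]):
        if event["lsn"] <= last_applied:
            continue  # already replicated, skip on restart
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            target[key] = event["row"]
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        last_applied = event["lsn"]
    return last_applied


target_table = {1: {"name": "Ada"}}  # state after the full load
log = [
    {"lsn": 101, "op": "insert", "key": 2, "row": {"name": "Grace"}},
    {"lsn": 102, "op": "update", "key": 1, "row": {"name": "Ada L."}},
    {"lsn": 103, "op": "delete", "key": 2},
]
checkpoint = apply_changes(target_table, log, last_applied=100)
print(target_table, checkpoint)
```

Because events at or below the checkpoint are skipped, replaying the same log after a restart leaves the target unchanged.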

Q3: How does Transfer Family work?

Answer: Transfer Family provides managed SFTP/FTPS endpoints backed by S3. Users connect via standard protocols; files land in S3.
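
A minimal sketch of that setup with boto3 might look like the following. The helper create_sftp_endpoint is hypothetical, and the public endpoint, service-managed identities, and per-user home directory layout are assumptions chosen for the example. The client is passed in so the function can be exercised without AWS credentials.

```python
def create_sftp_endpoint(transfer, user_name, role_arn, bucket, ssh_public_key):
    """Create an S3-backed SFTP server with one service-managed user.

    `transfer` is expected to be boto3.client('transfer'); `role_arn` is an
    IAM role that grants the user access to `bucket`. Returns the server ID.
    """
    server = transfer.create_server(
        Protocols=["SFTP"],
        Domain="S3",                             # files land in S3
        IdentityProviderType="SERVICE_MANAGED",  # users and keys stored by the service
        EndpointType="PUBLIC",
    )
    server_id = server["ServerId"]
    transfer.create_user(
        ServerId=server_id,
        UserName=user_name,
        Role=role_arn,
        HomeDirectory=f"/{bucket}/{user_name}",  # assumed per-user prefix
        SshPublicKeyBody=ssh_public_key,
    )
    return server_id


# Example call (requires AWS credentials and an existing role and bucket):
# import boto3
# server_id = create_sftp_endpoint(
#     boto3.client("transfer"), "alice",
#     "arn:aws:iam::123456789012:role/transfer-s3-access",  # placeholder role
#     "migration-bucket", "ssh-rsa AAAA...")
```

Users then connect with any standard SFTP client, and uploaded files appear as objects under their prefix in the bucket.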

Summary

  • DMS: Database migration with CDC support
  • Snowball: Physical device for large-scale data transfer
  • Transfer Family: Managed SFTP/FTPS backed by S3
  • DataSync: Automated file synchronization
  • Strategy: Choose based on data size and transfer requirements
