
Backup, Geo-Redundancy & Disaster Recovery

Business continuity with backup strategies, geo-redundancy, and disaster recovery for data engineering

DR Architecture

Architecture Diagram
┌───────────────────────────────────────────────────────────────────┐
│                  DISASTER RECOVERY ARCHITECTURE                   │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  PRIMARY REGION: EAST US 2       SECONDARY REGION: WEST US 2      │
│  ┌──────────────────────┐        ┌──────────────────────┐         │
│  │ ┌──────────────────┐ │        │ ┌──────────────────┐ │         │
│  │ │ ADLS Gen2        │ │ Geo-Rep│ │ ADLS Gen2        │ │         │
│  │ │ (RA-GRS)         │─┼───────>│ │ (Secondary)      │ │         │
│  │ └──────────────────┘ │        │ └──────────────────┘ │         │
│  │                      │        │                      │         │
│  │ ┌──────────────────┐ │        │ ┌──────────────────┐ │         │
│  │ │ Synapse Pool     │ │ Geo-Rep│ │ Synapse Pool     │ │         │
│  │ │ (Active)         │─┼───────>│ │ (Standby)        │ │         │
│  │ └──────────────────┘ │        │ └──────────────────┘ │         │
│  │                      │        │                      │         │
│  │ ┌──────────────────┐ │        │ ┌──────────────────┐ │         │
│  │ │ Cosmos DB        │ │ Multi- │ │ Cosmos DB        │ │         │
│  │ │ (Multi-region    │─┼───────>│ │ (Replica)        │ │         │
│  │ │  writes)         │ │ region │ │                  │ │         │
│  │ └──────────────────┘ │        │ └──────────────────┘ │         │
│  └──────────────────────┘        └──────────────────────┘         │
│                                                                   │
│  RPO/RTO TARGETS:                                                 │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │ Service          │ RPO          │ RTO           │ SLA       │  │
│  │ ─────────────────────────────────────────────────────────── │  │
│  │ ADLS (RA-GRS)    │ <15 min      │ <30 min       │ 99.99%    │  │
│  │ Synapse (Geo)    │ <1 hour      │ <4 hours      │ 99.9%     │  │
│  │ Cosmos DB        │ 0 (multi-    │ 0 (automatic  │ 99.999%   │  │
│  │                  │  region)     │  failover)    │           │  │
│  │ Event Hubs       │ 0 (capture)  │ Minutes       │ 99.95%    │  │
│  │ Databricks       │ Varies       │ Minutes-Hours │ 99.9%     │  │
│  └─────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
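
The SLA column in the targets table translates directly into a yearly downtime budget. The following is a minimal sketch of that arithmetic using the SLA figures from the table; the function name `allowed_downtime_minutes` and the 365-day year are assumptions made for illustration.

```python
# Convert an availability SLA percentage into a yearly downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Return the maximum downtime per year (in minutes) that an SLA permits."""
    return MINUTES_PER_YEAR * (1 - sla_percent / 100)


# SLA figures copied from the RPO/RTO targets table.
sla_by_service = {
    "ADLS (RA-GRS)": 99.99,
    "Synapse (Geo)": 99.9,
    "Cosmos DB": 99.999,
    "Event Hubs": 99.95,
    "Databricks": 99.9,
}

for service, sla in sla_by_service.items():
    budget = allowed_downtime_minutes(sla)
    print(f"{service:<15} {sla:>7}%  {budget:8.1f} min/year")
```

Each extra "nine" cuts the budget by a factor of ten, which is why the RTO a service can realistically promise is tied so closely to its SLA tier.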

Backup Configuration

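# Point-in-time restore for a Synapse dedicated SQL pool (Azure CLI).
# The restore creates a new pool, so a destination name is required;
# "SQLPool01-restored" is an illustrative name. Confirm the flags with
# `az synapse sql pool restore --help` for your CLI version.
# az synapse sql pool restore \
#   --name SQLPool01 \
#   --workspace-name syn-prod-workspace \
#   --resource-group rg-dataengineering-prod \
#   --dest-name SQLPool01-restored \
#   --time "$(date -u -d '2 hours ago' +%Y-%m-%dT%H:%M:%S)"

# Cosmos DB continuous backup
import requests
from azure.identity import DefaultAzureCredential

# Placeholders: replace with your own subscription, resource group, and account.
sub = "<subscription-id>"
rg = "<resource-group>"
account = "<cosmos-account-name>"

# Acquire an ARM token with whatever credential is available (CLI login, managed identity, ...).
credential = DefaultAzureCredential()
token = credential.get_token("https://management.azure.com/.default")

# Enable continuous backup (7-day tier) through the ARM REST API.
# Switching an account from periodic to continuous backup cannot be reverted.
response = requests.patch(
    f"https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.DocumentDB/databaseAccounts/{account}",
    params={"api-version": "2023-04-15"},  # assumed version; use one your subscription supports
    headers={"Authorization": f"Bearer {token.token}", "Content-Type": "application/json"},
    json={
        "properties": {
            "backupPolicy": {
                "type": "Continuous",
                "continuousModeProperties": {
                    "tier": "Continuous7Days"
                }
            }
        }
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly if ARM rejects the request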

Geo-Redundancy Configuration

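param location string = resourceGroup().location

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: 'stdatalake001'
  location: location
  sku: {
    name: 'Standard_RAGRS'  // Read-Access Geo-Redundant
  }
  kind: 'StorageV2'
  properties: {
    isHnsEnabled: true  // hierarchical namespace (ADLS Gen2)
    // Geo-replication is enabled by the SKU alone. Azure replicates to the
    // primary's paired region (Central US for East US 2); the storage account
    // resource has no property for choosing a destination account or region.
  }
}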

ℹ️

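Pro Tip: Use RA-GRS for ADLS Gen2 so that data stays readable from the secondary region during a regional outage. Use Cosmos DB multi-region writes so that the surviving region keeps accepting writes without a manual failover; data loss is then limited to writes that had not yet replicated, which depends on the account's consistency level.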
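
To make the RA-GRS read path concrete, here is a minimal sketch of a primary-then-secondary read. It relies on the `-secondary` suffix that RA-GRS accounts expose on their read-only endpoint; the helper names `secondary_endpoint` and `read_with_fallback`, and the simulated readers, are hypothetical and stand in for real storage SDK calls.

```python
from typing import Callable


def secondary_endpoint(account: str, service: str = "dfs") -> str:
    """Build the read-only secondary endpoint of an RA-GRS storage account."""
    return f"https://{account}-secondary.{service}.core.windows.net"


def read_with_fallback(
    read_primary: Callable[[], bytes],
    read_secondary: Callable[[], bytes],
) -> tuple[bytes, str]:
    """Try the primary region first; if that fails, read from the secondary."""
    try:
        return read_primary(), "primary"
    except Exception:  # a real client would catch specific SDK/network errors
        return read_secondary(), "secondary"


# Simulated readers standing in for real storage client calls.
def failing_primary() -> bytes:
    raise ConnectionError("simulated regional outage")


def healthy_secondary() -> bytes:
    return b"replicated copy"


data, source = read_with_fallback(failing_primary, healthy_secondary)
print(f"read {len(data)} bytes from {source}")
print(secondary_endpoint("stdatalake001"))
```

Because replication to the secondary is asynchronous, a fallback read can return data that lags the primary by up to the RPO window, so consumers should tolerate slightly stale results.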

Interview Questions

Q1: Explain the difference between RPO and RTO.
A: RPO (Recovery Point Objective) is the maximum acceptable data loss (e.g., 15 minutes). RTO (Recovery Time Objective) is the maximum acceptable downtime (e.g., 1 hour). Both drive DR strategy design.
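
The two objectives can be measured from an incident timeline. A minimal sketch, assuming three timestamps are known (last replicated write, outage start, service restored); the timeline values and function names are invented for illustration, and the targets are the example figures from the answer above.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_replicated_write: datetime, outage_start: datetime) -> timedelta:
    """Data-loss window: writes made after the last replicated one are lost."""
    return outage_start - last_replicated_write


def achieved_rto(outage_start: datetime, service_restored: datetime) -> timedelta:
    """Downtime window: from the start of the outage until service resumes."""
    return service_restored - outage_start


# Hypothetical incident timeline (all values invented for illustration).
last_write = datetime(2024, 1, 15, 9, 50)
outage = datetime(2024, 1, 15, 10, 0)
restored = datetime(2024, 1, 15, 10, 45)

rpo = achieved_rpo(last_write, outage)
rto = achieved_rto(outage, restored)

# Example targets from the answer: RPO 15 minutes, RTO 1 hour.
print(f"RPO {rpo} within target: {rpo <= timedelta(minutes=15)}")
print(f"RTO {rto} within target: {rto <= timedelta(hours=1)}")
```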

Q2: How do you test disaster recovery in Azure?
A:
1) Simulate a regional outage.
2) Initiate failover to the secondary region.
3) Verify data integrity.
4) Test application functionality.
5) Measure actual RPO/RTO against the targets.
6) Document findings and improve.
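
The data integrity check in the answer above can be sketched as a checksum comparison between the two regions. This is one possible approach, not a prescribed Azure procedure; the file paths, contents, and the `compare_manifests` helper are hypothetical, and a real drill would build the manifests by listing and hashing files in each storage account.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest used to compare file contents across regions."""
    return hashlib.sha256(data).hexdigest()


def compare_manifests(
    primary: dict[str, str], secondary: dict[str, str]
) -> dict[str, list[str]]:
    """Compare path -> checksum manifests and report what the secondary lacks."""
    missing = sorted(p for p in primary if p not in secondary)
    mismatched = sorted(
        p for p in primary if p in secondary and primary[p] != secondary[p]
    )
    return {"missing": missing, "mismatched": mismatched}


# Hypothetical file contents in each region.
primary_files = {
    "raw/orders.parquet": b"orders-v2",
    "raw/customers.parquet": b"customers-v1",
    "raw/events.parquet": b"events-v1",
}
secondary_files = {
    "raw/orders.parquet": b"orders-v1",  # stale copy
    "raw/customers.parquet": b"customers-v1",
}

primary_manifest = {path: checksum(d) for path, d in primary_files.items()}
secondary_manifest = {path: checksum(d) for path, d in secondary_files.items()}

report = compare_manifests(primary_manifest, secondary_manifest)
print(report)
```

Files reported as missing or mismatched indicate data written after the last successful replication, which gives a concrete measure of the data loss the drill would have caused.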

Q3: What is the cost impact of geo-redundancy?
A: Geo-redundancy roughly doubles storage costs because the data is kept in both the primary and the secondary region, and geo-replication data transfer is billed on top. However, the cost of downtime (lost revenue, reputation) often far exceeds the additional storage cost. Use RA-GRS for critical data.
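
The trade-off in this answer can be put into numbers. A minimal sketch follows; the per-GB prices, data volume, and hourly downtime cost are placeholders chosen for illustration, not real Azure rates, and the 2x ratio simply mirrors the doubling described in the answer.

```python
def monthly_storage_cost(size_tb: float, price_per_gb: float) -> float:
    """Monthly capacity cost for a given data volume and per-GB price."""
    return size_tb * 1024 * price_per_gb


def breakeven_outage_hours(extra_monthly_cost: float, downtime_cost_per_hour: float) -> float:
    """Hours of avoided downtime per month that pay for the redundancy."""
    return extra_monthly_cost / downtime_cost_per_hour


# Placeholder prices, NOT real Azure rates; check the current pricing page.
LRS_PRICE_PER_GB = 0.02
RAGRS_PRICE_PER_GB = 0.04  # assumes 2x LRS, matching the doubling described above

size_tb = 50  # hypothetical data lake size
lrs = monthly_storage_cost(size_tb, LRS_PRICE_PER_GB)
ragrs = monthly_storage_cost(size_tb, RAGRS_PRICE_PER_GB)
extra = ragrs - lrs

# Hypothetical business impact of one hour of downtime.
hours = breakeven_outage_hours(extra, downtime_cost_per_hour=10_000)

print(f"LRS ${lrs:,.0f}/mo, RA-GRS ${ragrs:,.0f}/mo, extra ${extra:,.0f}/mo")
print(f"Break-even: {hours:.2f} hours of avoided downtime per month")
```

With these placeholder inputs the extra spend is recovered by avoiding only a few minutes of downtime per month; the conclusion shifts with real prices and a realistic downtime cost, so the inputs matter more than the formula.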
