Infrastructure as Code for Data Platforms

HashiCorp & Netflix Interview

Automating data platform provisioning and management

Interview Question

"Design an Infrastructure as Code solution for a data platform that: (1) provisions Snowflake, Kafka, and Spark clusters, (2) manages environment separation (dev/staging/prod), (3) implements RBAC, (4) handles secrets management, (5) includes CI/CD pipeline. Include Terraform code and best practices."

Difficulty: Hard | Frequently asked at HashiCorp, Netflix, Uber, Datadog


Theoretical Foundation

What is Infrastructure as Code (IaC)?

IaC is the practice of defining and managing infrastructure through version-controlled, machine-readable configuration files rather than manual processes.

Traditional:
  - Manual provisioning
  - Configuration drift
  - Inconsistent environments
  - Slow disaster recovery

IaC:
  - Automated provisioning
  - Version controlled
  - Consistent environments
  - Fast disaster recovery

IaC Tools:
  - Terraform (HashiCorp)
  - CloudFormation (AWS)
  - Pulumi (Multi-cloud)
  - Ansible (Configuration management)

Terraform Architecture

Terraform Configuration Files (.tf): main.tf | variables.tf | outputs.tf | providers.tf
  ↓
Terraform Plan: show what will be created/modified/deleted
  ↓
Terraform Apply: create/modify/delete resources
  ↓
State File (terraform.tfstate): store remotely (S3, GCS)

Terraform State Management

Local State:
  - Stored on local machine
  - Not shared
  - Risk of loss

Remote State:
  - Stored in cloud storage (S3, GCS)
  - Shared across team
  - State locking (DynamoDB, GCS)
  - Versioning

State Locking:
  - Prevents concurrent modifications
  - Uses DynamoDB (AWS) or GCS (GCP)
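The remote-state properties listed above (shared storage, versioning, locking) can themselves be provisioned with Terraform. The following is a minimal sketch, assuming the bucket and table names used by the S3 backend configuration later in this guide (`company-terraform-state`, `terraform-locks`). These resources are typically created once from a small bootstrap configuration, because a backend cannot keep its state in a bucket that does not exist yet.

```hcl
# Bootstrap resources for remote state (names are assumptions; adjust to your account).
resource "aws_s3_bucket" "terraform_state" {
  bucket = "company-terraform-state"
}

# Keep every revision of the state file so a bad apply can be rolled back.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state at rest; state files often contain secrets.
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Lock table for the S3 backend, which expects a string hash key named "LockID".
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

On-demand billing is chosen here only because lock traffic is tiny; a provisioned table would work as well.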

Environment Separation

Approach 1: Separate State Files

  dev/
  ├── main.tf
  ├── variables.tf
  └── terraform.tfstate
  staging/
  ├── main.tf
  ├── variables.tf
  └── terraform.tfstate
  prod/
  ├── main.tf
  ├── variables.tf
  └── terraform.tfstate

Approach 2: Workspaces

  terraform workspace new dev
  terraform workspace new staging
  terraform workspace new prod

  terraform workspace select dev
  terraform apply
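With workspaces, a single configuration serves every environment, so per-environment differences have to be derived from the built-in `terraform.workspace` value. A minimal sketch of that pattern follows; the size map and the naming scheme are illustrative assumptions, not part of the original design.

```hcl
locals {
  # Hypothetical per-environment settings, keyed by workspace name.
  warehouse_size_by_env = {
    dev     = "XSMALL"
    staging = "SMALL"
    prod    = "MEDIUM"
  }

  environment = terraform.workspace

  # Fall back to the smallest size for any workspace not listed above.
  warehouse_size = lookup(local.warehouse_size_by_env, terraform.workspace, "XSMALL")
}

resource "snowflake_warehouse" "analytics" {
  name           = "ANALYTICS_WH_${upper(local.environment)}"
  warehouse_size = local.warehouse_size
  auto_suspend   = 60
  auto_resume    = true
}
```

Separate directories keep each environment's differences explicit and allow a distinct backend per environment, while workspaces share one backend configuration; which trade-off fits better depends on how far the environments diverge.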

Code Implementation

Terraform Project Structure

data-platform/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       └── terraform.tfvars
├── modules/
│   ├── snowflake/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── kafka/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── spark/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── networking/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── .gitignore
└── README.md

Provider Configuration

# environments/prod/providers.tf
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    snowflake = {
      source  = "Snowflake-Labs/snowflake"
      version = "~> 0.89"
    }
    kafka = {
      source  = "Mongey/kafka"
      version = "~> 0.6"
    }
  }
  
  # Remote state backend
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "data-platform/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

# AWS Provider
provider "aws" {
  region = var.aws_region
  
  default_tags {
    tags = {
      Environment = var.environment
      Team        = "data-engineering"
      ManagedBy   = "terraform"
    }
  }
}

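# Snowflake Provider
provider "snowflake" {
  # ACCOUNTADMIN is used here for brevity; a dedicated, least-privilege
  # Terraform role is preferable in production.
  role          = "ACCOUNTADMIN"
  account       = var.snowflake_account
  user          = var.snowflake_user
  # Key-pair authentication. The 0.x releases pinned above ("~> 0.89") name
  # this authenticator "JWT"; "SNOWFLAKE_JWT" belongs to the 1.x releases.
  authenticator = "JWT"
  private_key   = var.snowflake_private_key
}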
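The provider blocks above reference several input variables that the environment has to declare. A minimal sketch of the matching `variables.tf` follows; the default region and the validation rule are assumptions added for illustration.

```hcl
# environments/prod/variables.tf (sketch)
variable "environment" {
  type        = string
  description = "Deployment environment name."

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

variable "aws_region" {
  type    = string
  default = "us-east-1" # assumed default, matching the backend region
}

variable "snowflake_account" {
  type = string
}

variable "snowflake_user" {
  type = string
}

variable "snowflake_private_key" {
  type      = string
  sensitive = true # redacted in plan/apply output, but still written to state
}
```

Non-secret values can then live in each environment's `terraform.tfvars`, while the private key is better supplied at run time, for example through a `TF_VAR_snowflake_private_key` environment variable.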

Snowflake Module

# modules/snowflake/main.tf

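# Snowflake Warehouse
resource "snowflake_warehouse" "analytics" {
  name           = "ANALYTICS_WH"
  comment        = "Analytics warehouse"
  warehouse_size = "MEDIUM"

  auto_suspend = 60 # seconds of inactivity before the warehouse suspends
  auto_resume  = true

  # Multi-cluster settings; these require a Snowflake edition that supports
  # multi-cluster warehouses.
  min_cluster_count = 1
  max_cluster_count = 5
  scaling_policy    = "ECONOMY"
}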

# Snowflake Database
resource "snowflake_database" "analytics" {
  name    = "ANALYTICS_DB"
  comment = "Analytics database"
}

# Snowflake Schema
resource "snowflake_schema" "staging" {
  database = snowflake_database.analytics.name
  name     = "STAGING"
  comment  = "Staging schema"
}

resource "snowflake_schema" "marts" {
  database = snowflake_database.analytics.name
  name     = "MARTS"
  comment  = "Marts schema"
}

# Snowflake Role
resource "snowflake_role" "analyst" {
  name    = "ANALYST"
  comment = "Analyst role"
}

# Grant privileges
resource "snowflake_grant_privileges_to_account_role" "analyst_warehouse" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["USAGE"]
  on_account_object {
    object_type = "WAREHOUSE"
    object_name = snowflake_warehouse.analytics.name
  }
}

resource "snowflake_grant_privileges_to_account_role" "analyst_database" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["USAGE"]
  on_account_object {
    object_type = "DATABASE"
    object_name = snowflake_database.analytics.name
  }
}

resource "snowflake_grant_privileges_to_account_role" "analyst_schemas" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["USAGE"]
  on_schema {
    schema_name = "${snowflake_database.analytics.name}.${snowflake_schema.staging.name}"
  }
}

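resource "snowflake_grant_privileges_to_account_role" "analyst_tables" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["SELECT"]
  # A wildcard such as "DB.SCHEMA.*" is not a valid object name. To grant on
  # every existing table in the schema, use an "all" block instead.
  on_schema_object {
    all {
      object_type_plural = "TABLES"
      in_schema          = "${snowflake_database.analytics.name}.${snowflake_schema.staging.name}"
    }
  }
}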
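The grants above cover objects that exist at apply time. For the RBAC requirement in the interview question, two further pieces are commonly added: a grant on future tables and a place for the role in the role hierarchy. The sketch below assumes the 0.x provider resource names used elsewhere in this module and assumes that analysts should also read tables created later; the `SYSADMIN` parent is likewise an assumption.

```hcl
# Also cover tables created after this apply.
resource "snowflake_grant_privileges_to_account_role" "analyst_future_tables" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["SELECT"]

  on_schema_object {
    future {
      object_type_plural = "TABLES"
      in_schema          = "${snowflake_database.analytics.name}.${snowflake_schema.staging.name}"
    }
  }
}

# Attach the functional role to SYSADMIN so it is part of the role hierarchy
# rather than an orphaned role.
resource "snowflake_grant_account_role" "analyst_to_sysadmin" {
  role_name        = snowflake_role.analyst.name
  parent_role_name = "SYSADMIN"
}
```

The same `snowflake_grant_account_role` resource accepts a `user_name` instead of `parent_role_name` when the role should be granted to an individual user.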

Kafka Module

# modules/kafka/main.tf

# MSK Cluster
resource "aws_msk_cluster" "kafka" {
  cluster_name           = "data-platform-kafka"
  kafka_version          = "3.5.1"
  number_of_broker_nodes = 3
  
  broker_node_group_info {
    instance_type   = "kafka.m5.large"
    client_subnets  = var.subnet_ids
    security_groups = [aws_security_group.kafka.id]
    
    storage_info {
      ebs_storage_info {
        volume_size = 1000
      }
    }
  }
  
  encryption_info {
    encryption_in_transit {
      client_broker = "TLS"
      in_cluster    = true
    }
    encryption_at_rest_kms_key_arn = aws_kms_key.kafka.arn
  }
  
  configuration_info {
    arn      = aws_msk_configuration.kafka.arn
    revision = aws_msk_configuration.kafka.latest_revision
  }
  
  tags = {
    Name = "data-platform-kafka"
  }
}

# MSK Configuration
resource "aws_msk_configuration" "kafka" {
  name              = "data-platform-config"
  kafka_versions    = ["3.5.1"]
  
  server_properties = <<PROPERTIES
auto.create.topics.enable=true
delete.topic.enable=true
num.partitions=100
default.replication.factor=3
min.insync.replicas=2
log.retention.hours=168
PROPERTIES
}

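# Security Group
resource "aws_security_group" "kafka" {
  name        = "kafka-security-group"
  description = "Security group for Kafka cluster"
  # Without vpc_id the group is created in the default VPC, which would not
  # match the MSK subnets. var.vpc_id is an assumed module input.
  vpc_id      = var.vpc_id

  # TLS listener. The plaintext listener (9092) is not opened because the
  # cluster sets client_broker = "TLS".
  ingress {
    from_port   = 9094
    to_port     = 9094
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }
}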

# KMS Key for encryption
resource "aws_kms_key" "kafka" {
  description = "KMS key for Kafka encryption"
}
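The provider configuration declares the `Mongey/kafka` provider, which allows topics to be managed as code alongside the cluster. A minimal sketch follows, assuming that provider has been configured with the cluster's TLS bootstrap brokers; the topic name and settings are hypothetical.

```hcl
# Hypothetical topic; the name, partition count, and settings are illustrative.
resource "kafka_topic" "events" {
  name               = "platform.events"
  partitions         = 12
  replication_factor = 3

  config = {
    "retention.ms"        = "604800000" # 7 days, in line with log.retention.hours=168
    "min.insync.replicas" = "2"
    "cleanup.policy"      = "delete"
  }
}
```

Declaring topics explicitly like this usually goes together with setting `auto.create.topics.enable=false` on the cluster, so that every topic is reviewed through the same pipeline as the rest of the platform.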

Spark Module

# modules/spark/main.tf

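# EMR Cluster
resource "aws_emr_cluster" "spark" {
  name          = "data-platform-spark"
  release_label = "emr-6.15.0"
  applications  = ["Spark", "Hive", "JupyterEnterpriseGateway"]

  service_role = aws_iam_role.emr_service.arn
  # EMR needs an autoscaling role to apply the policy on the core group.
  # var.emr_autoscaling_role_arn is an assumed input, not defined in this guide.
  autoscaling_role = var.emr_autoscaling_role_arn

  # Ship cluster logs to the bucket declared below.
  log_uri = "s3://${aws_s3_bucket.spark_logs.bucket}/logs/"

  master_instance_group {
    instance_type  = "m5.xlarge"
    instance_count = 1
  }

  core_instance_group {
    instance_type  = "m5.2xlarge"
    instance_count = 3

    # autoscaling_policy is a JSON string, not a nested block.
    # The 15% threshold and the alarm settings are illustrative assumptions.
    autoscaling_policy = jsonencode({
      Constraints = {
        MinCapacity = 3
        MaxCapacity = 10
      }
      Rules = [
        {
          Name = "ScaleOutOnLowYarnMemory"
          Action = {
            SimpleScalingPolicyConfiguration = {
              AdjustmentType    = "CHANGE_IN_CAPACITY"
              ScalingAdjustment = 1
              CoolDown          = 300
            }
          }
          Trigger = {
            CloudWatchAlarmDefinition = {
              ComparisonOperator = "LESS_THAN"
              EvaluationPeriods  = 1
              MetricName         = "YARNMemoryAvailablePercentage"
              Namespace          = "AWS/ElasticMapReduce"
              Period             = 300
              Statistic          = "AVERAGE"
              Threshold          = 15.0
              Unit               = "PERCENT"
            }
          }
        }
      ]
    })
  }

  ec2_attributes {
    key_name  = var.ssh_key_name
    subnet_id = var.subnet_id
    # instance_profile is required; var.emr_instance_profile_arn is an assumed input.
    instance_profile = var.emr_instance_profile_arn
    # The two security groups are assumed to be defined elsewhere in this module.
    emr_managed_master_security_group = aws_security_group.emr_master.id
    emr_managed_slave_security_group  = aws_security_group.emr_slave.id
  }

  tags = {
    Name = "data-platform-spark"
  }
}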

# S3 Bucket for Spark logs
resource "aws_s3_bucket" "spark_logs" {
  bucket = "data-platform-spark-logs-${var.environment}"
}

# IAM Role for EMR
resource "aws_iam_role" "emr_service" {
  name = "emr-service-role"
  
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "emr.amazonaws.com"
        }
      }
    ]
  })
}

Secrets Management

# modules/secrets/main.tf

# AWS Secrets Manager
resource "aws_secretsmanager_secret" "snowflake" {
  name = "data-platform/snowflake"
}

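# Note: secret_string is also written to Terraform state, so the state
# backend must be encrypted and access-controlled.
resource "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = aws_secretsmanager_secret.snowflake.id
  secret_string = jsonencode({
    account  = var.snowflake_account
    user     = var.snowflake_user
    password = var.snowflake_password
  })
}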

# HashiCorp Vault (alternative)
# resource "vault_generic_secret" "snowflake" {
#   path = "secret/data-platform/snowflake"
#   
#   data_json = jsonencode({
#     account   = var.snowflake_account
#     user     = var.snowflake_user
#     password = var.snowflake_password
#   })
# }

CI/CD Pipeline

# .github/workflows/terraform.yml
name: Terraform

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

env:
  TF_VERSION: "1.6.0"
  # github.event.inputs is only populated for workflow_dispatch runs, so it is
  # empty on push and pull_request. The target directory is set explicitly
  # instead (assumption: this workflow manages prod).
  TF_WORKING_DIR: environments/prod

jobs:
  terraform:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Terraform Init
        run: terraform init -input=false
        working-directory: ${{ env.TF_WORKING_DIR }}

      - name: Terraform Validate
        run: terraform validate
        working-directory: ${{ env.TF_WORKING_DIR }}

      - name: Terraform Plan
        run: terraform plan -input=false -out=tfplan
        working-directory: ${{ env.TF_WORKING_DIR }}

      # Apply only after a merge to main, and only the plan produced above.
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -input=false -auto-approve tfplan
        working-directory: ${{ env.TF_WORKING_DIR }}

Usage

# Run from the environment directory; its terraform.tfvars is loaded automatically
cd environments/prod

# Initialize Terraform
terraform init

# Plan changes and save the plan
terraform plan -out=tfplan

# Apply exactly the saved plan
terraform apply tfplan

# Destroy infrastructure
terraform destroy

Production Tip: Always use remote state with state locking. Store state in encrypted S3 with versioning. Use separate state files for each environment. Never commit state files to Git.


Common Follow-Up Questions

Q1: How do you handle secrets in Terraform?

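# Option 1: declare the secret as a sensitive variable
# (redacted in plan output, but still stored in state)
variable "snowflake_password" {
  type      = string
  sensitive = true
}

# Option 2: read the secret from AWS Secrets Manager at plan time
data "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = "data-platform/snowflake"
}

# Option 3 (shell, not HCL): supply the variable through the environment.
# Prefer injecting the value from a secret store over typing it inline.
export TF_VAR_snowflake_password="mysecretpassword"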
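Reading the secret is only half of the pattern; the JSON payload still has to be decoded before its fields can be used. A minimal sketch follows, assuming the secret holds the `account`, `user`, and `password` keys written by the secrets module, and shown as an alternative to the key-pair provider block earlier in this guide. The `SYSADMIN` role is an assumption.

```hcl
data "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = "data-platform/snowflake"
}

locals {
  # secret_string holds the JSON document stored in Secrets Manager.
  snowflake_credentials = jsondecode(data.aws_secretsmanager_secret_version.snowflake.secret_string)
}

provider "snowflake" {
  account  = local.snowflake_credentials["account"]
  user     = local.snowflake_credentials["user"]
  password = local.snowflake_credentials["password"]
  role     = "SYSADMIN" # assumed role; pick the least-privileged one that works
}
```

Values read through a data source are still recorded in state, so this approach removes secrets from variables and Git but not from the state file.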

Q2: How do you manage Terraform state?

# Remote state with S3 backend
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "data-platform/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

Q3: How do you handle Terraform modules?

# Use modules for reusable components
module "snowflake" {
  source = "../../modules/snowflake"
  
  environment = var.environment
  account     = var.snowflake_account
}
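A module is easier to reuse when its inputs and outputs are declared explicitly. The sketch below shows one possible interface for the Snowflake module called above; the output names are assumptions, chosen so that the root output matches the `snowflake_warehouse` name read by the Terratest example in Q4.

```hcl
# modules/snowflake/variables.tf (sketch)
variable "environment" {
  type = string
}

variable "account" {
  type = string
}

# modules/snowflake/outputs.tf (sketch)
output "warehouse_name" {
  description = "Name of the analytics warehouse."
  value       = snowflake_warehouse.analytics.name
}

# environments/prod/outputs.tf (sketch): re-export for tests and other tooling
output "snowflake_warehouse" {
  value = module.snowflake.warehouse_name
}
```

Only values exposed through `output` blocks are visible to the calling configuration, which keeps the module's internal resources free to change.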

Q4: How do you implement Terraform testing?

// Use Terratest for integration tests
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestTerraform(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../examples/simple",
    })

    // Tear down whatever was created, even if an assertion fails
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Assertions
    output := terraform.Output(t, terraformOptions, "snowflake_warehouse")
    assert.Equal(t, "ANALYTICS_WH", output)
}

Critical Consideration: Never commit secrets to Git. Use environment variables, AWS Secrets Manager, or HashiCorp Vault. Always encrypt state files and use state locking.


Company-Specific Tips

HashiCorp Interview Tips

  • Discuss Terraform best practices
  • Explain state management strategies
  • Mention modules and workspaces
  • Talk about Terraform Cloud features

Netflix Interview Tips

  • Focus on multi-cloud Terraform
  • Explain environment separation strategies
  • Mention secrets management at scale
  • Talk about CI/CD for infrastructure

Uber Interview Tips

  • Discuss Terraform for Kubernetes
  • Explain Helm charts for applications
  • Mention ArgoCD for GitOps
  • Talk about infrastructure testing

Final Takeaway: Infrastructure as Code is essential for managing modern data platforms. Use Terraform for provisioning, modules for reusability, and remote state for collaboration. Always implement proper secrets management, testing, and CI/CD.
