HashiCorp & Netflix Interview

Infrastructure as Code for Data Platforms

Automating data platform provisioning and management

Interview Question

"Design an Infrastructure as Code solution for a data platform that: (1) provisions Snowflake, Kafka, and Spark clusters, (2) manages environment separation (dev/staging/prod), (3) implements RBAC, (4) handles secrets management, (5) includes CI/CD pipeline. Include Terraform code and best practices."

Difficulty: Hard | Frequently asked at HashiCorp, Netflix, Uber, Datadog

Theoretical Foundation

What is Infrastructure as Code (IaC)?

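IaC is the practice of defining infrastructure in version-controlled, machine-readable configuration files instead of provisioning it through manual, ad hoc processes. A tool such as Terraform reads those files and reconciles the real infrastructure to match them.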

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│              Infrastructure as Code                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Traditional:                                               │
│  - Manual provisioning                                      │
│  - Configuration drift                                      │
│  - Inconsistent environments                               │
│  - Slow disaster recovery                                  │
│                                                             │
│  IaC:                                                       │
│  - Automated provisioning                                   │
│  - Version controlled                                       │
│  - Consistent environments                                 │
│  - Fast disaster recovery                                  │
│                                                             │
│  IaC Tools:                                                 │
│  - Terraform (HashiCorp)                                   │
│  - CloudFormation (AWS)                                    │
│  - Pulumi (Multi-cloud)                                    │
│  - Ansible (Configuration management)                      │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Terraform Architecture

Terraform State Management

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│              State Management                               │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Local State:                                               │
│  - Stored on local machine                                 │
│  - Not shared                                              │
│  - Risk of loss                                            │
│                                                             │
│  Remote State:                                              │
│  - Stored in cloud storage (S3, GCS)                       │
│  - Shared across team                                      │
│  - State locking (DynamoDB, GCS)                           │
│  - Versioning                                              │
│                                                             │
│  State Locking:                                             │
│  - Prevents concurrent modifications                       │
│  - Uses DynamoDB (AWS) or GCS (GCP)                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
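
The remote-state setup in the diagram can be sketched as a small bootstrap configuration. This is a minimal sketch, assuming an AWS account and the 5.x AWS provider; the bucket and table names are illustrative and simply mirror the backend block used later in this guide. It is usually applied once, from a separate configuration, before any environment points its backend at the bucket.

```hcl
# Bootstrap resources for an S3 remote state backend (illustrative names).

resource "aws_s3_bucket" "terraform_state" {
  bucket = "company-terraform-state"

  lifecycle {
    prevent_destroy = true # guard against accidental deletion of state
  }
}

# Versioning keeps earlier state revisions so a bad apply can be rolled back.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# State files can contain secrets, so encrypt them at rest.
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Lock table for the S3 backend; the hash key must be named "LockID".
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```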

Environment Separation

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│              Environment Separation                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Approach 1: Separate State Files                          │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  dev/                                               │   │
│  │  ├── main.tf                                        │   │
│  │  ├── variables.tf                                   │   │
│  │  └── terraform.tfstate                              │   │
│  │  staging/                                           │   │
│  │  ├── main.tf                                        │   │
│  │  ├── variables.tf                                   │   │
│  │  └── terraform.tfstate                              │   │
│  │  prod/                                              │   │
│  │  ├── main.tf                                        │   │
│  │  ├── variables.tf                                   │   │
│  │  └── terraform.tfstate                              │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  Approach 2: Workspaces                                    │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  terraform workspace new dev                        │   │
│  │  terraform workspace new staging                    │   │
│  │  terraform workspace new prod                       │   │
│  │                                                     │   │
│  │  terraform workspace select dev                     │   │
│  │  terraform apply                                    │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
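
With the workspace approach, one configuration serves every environment, so per-environment values have to be looked up at plan time. The sketch below shows one common way to do that with `terraform.workspace`. The setting names and sizes are hypothetical and only illustrate the pattern.

```hcl
locals {
  # Per-environment settings keyed by workspace name (illustrative values).
  env_config = {
    dev = {
      warehouse_size = "XSMALL"
      broker_count   = 3
    }
    staging = {
      warehouse_size = "SMALL"
      broker_count   = 3
    }
    prod = {
      warehouse_size = "MEDIUM"
      broker_count   = 6
    }
  }

  # terraform.workspace is the name of the currently selected workspace.
  # Indexing the map fails the plan for any workspace not listed above
  # (including "default"), which prevents applying to an unplanned environment.
  config = local.env_config[terraform.workspace]
}

output "active_environment" {
  value = {
    name   = terraform.workspace
    config = local.config
  }
}
```

Workspaces share one backend and one set of credentials, so many teams prefer separate directories and state files (Approach 1) when production needs stricter isolation.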

Code Implementation

Terraform Project Structure

Directory Layout

data-platform/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       └── terraform.tfvars
├── modules/
│   ├── snowflake/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── kafka/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── spark/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── networking/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── .gitignore
└── README.md
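
In this layout each environment directory is a thin root module that wires the shared modules together. The following is a rough sketch of what `environments/dev/main.tf` might contain. The module input names and the `private_subnet_ids` output of the networking module are assumptions made for illustration; they are not defined elsewhere in this guide.

```hcl
# environments/dev/main.tf (sketch; module inputs and outputs are assumed)

variable "environment" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

variable "ssh_key_name" {
  type = string
}

variable "snowflake_account" {
  type = string
}

module "networking" {
  source = "../../modules/networking"

  environment = var.environment
  vpc_cidr    = var.vpc_cidr
}

module "kafka" {
  source = "../../modules/kafka"

  vpc_cidr   = var.vpc_cidr
  subnet_ids = module.networking.private_subnet_ids # hypothetical output
}

module "spark" {
  source = "../../modules/spark"

  environment  = var.environment
  ssh_key_name = var.ssh_key_name
  subnet_id    = module.networking.private_subnet_ids[0] # hypothetical output
}

module "snowflake" {
  source = "../../modules/snowflake"

  environment = var.environment
  account     = var.snowflake_account
}
```

The values themselves live in each environment's `terraform.tfvars`, so dev, staging, and prod differ only in data, not in code.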

Provider Configuration

# environments/prod/providers.tf
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    snowflake = {
      source  = "Snowflake-Labs/snowflake"
      version = "~> 0.89"
    }
    kafka = {
      source  = "Mongey/kafka"
      version = "~> 0.6"
    }
  }
  
  # Remote state backend
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "data-platform/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

# AWS Provider
provider "aws" {
  region = var.aws_region
  
  default_tags {
    tags = {
      Environment = var.environment
      Team        = "data-engineering"
      ManagedBy   = "terraform"
    }
  }
}

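# Snowflake Provider
# ACCOUNTADMIN keeps the example short; in practice, prefer a dedicated
# least-privilege role for Terraform (for example, SYSADMIN plus SECURITYADMIN).
# Check the authenticator value accepted by the provider version you pin:
# some 0.x releases expect "JWT" for key-pair authentication.
provider "snowflake" {
  role          = "ACCOUNTADMIN"
  account       = var.snowflake_account
  user          = var.snowflake_user
  authenticator = "SNOWFLAKE_JWT"
  private_key   = var.snowflake_private_key
}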

Snowflake Module

# modules/snowflake/main.tf

# Snowflake Warehouse
resource "snowflake_warehouse" "analytics" {
  name           = "ANALYTICS_WH"
  comment        = "Analytics warehouse"
  warehouse_size = "medium"
  
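  auto_suspend      = 60 # seconds of inactivity before the warehouse suspends
  auto_resume       = true
  min_cluster_count = 1
  max_cluster_count = 5 # multi-cluster warehouses require Enterprise Edition or higher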
  
  scaling_policy = "ECONOMY"
}

# Snowflake Database
resource "snowflake_database" "analytics" {
  name    = "ANALYTICS_DB"
  comment = "Analytics database"
}

# Snowflake Schema
resource "snowflake_schema" "staging" {
  database = snowflake_database.analytics.name
  name     = "STAGING"
  comment  = "Staging schema"
}

resource "snowflake_schema" "marts" {
  database = snowflake_database.analytics.name
  name     = "MARTS"
  comment  = "Marts schema"
}

# Snowflake Role
resource "snowflake_role" "analyst" {
  name    = "ANALYST"
  comment = "Analyst role"
}

# Grant privileges
resource "snowflake_grant_privileges_to_account_role" "analyst_warehouse" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["USAGE"]
  on_account_object {
    object_type = "WAREHOUSE"
    object_name = snowflake_warehouse.analytics.name
  }
}

resource "snowflake_grant_privileges_to_account_role" "analyst_database" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["USAGE"]
  on_account_object {
    object_type = "DATABASE"
    object_name = snowflake_database.analytics.name
  }
}

resource "snowflake_grant_privileges_to_account_role" "analyst_schemas" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["USAGE"]
  on_schema {
    schema_name = "${snowflake_database.analytics.name}.${snowflake_schema.staging.name}"
  }
}

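# SELECT on every existing table in the staging schema.
# object_name does not accept wildcards; the `all` block covers all tables in the schema.
resource "snowflake_grant_privileges_to_account_role" "analyst_tables" {
  account_role_name = snowflake_role.analyst.name
  privileges        = ["SELECT"]
  on_schema_object {
    all {
      object_type_plural = "TABLES"
      # Fully qualified, quoted identifier, as in the provider documentation
      in_schema = "\"${snowflake_database.analytics.name}\".\"${snowflake_schema.staging.name}\""
    }
  }
}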
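
The grants above cover a single role and the objects that exist at apply time. A fuller RBAC setup usually also places custom roles in the role hierarchy, assigns them to users, and covers tables created later. The sketch below illustrates that under stated assumptions: the role name, the variables, and their defaults are hypothetical, and the resource and argument names follow the Snowflake-Labs provider documentation for the 0.8x/0.9x series, so verify them against the version you pin.

```hcl
variable "database_name" {
  type    = string
  default = "ANALYTICS_DB"
}

variable "schema_name" {
  type    = string
  default = "MARTS"
}

variable "reporting_users" {
  description = "Snowflake user names that should receive the role (hypothetical input)"
  type        = set(string)
  default     = []
}

resource "snowflake_role" "reporting" {
  name    = "REPORTING_READER"
  comment = "Read-only access to the marts schema"
}

# Roll the custom role up to SYSADMIN so administrators inherit its privileges.
resource "snowflake_grant_account_role" "reporting_to_sysadmin" {
  role_name        = snowflake_role.reporting.name
  parent_role_name = "SYSADMIN"
}

# Assign the role to each listed user.
resource "snowflake_grant_account_role" "reporting_to_users" {
  for_each = var.reporting_users

  role_name = snowflake_role.reporting.name
  user_name = each.value
}

# Future grant: tables created later in the schema become readable automatically.
resource "snowflake_grant_privileges_to_account_role" "reporting_future_tables" {
  account_role_name = snowflake_role.reporting.name
  privileges        = ["SELECT"]

  on_schema_object {
    future {
      object_type_plural = "TABLES"
      in_schema          = "\"${var.database_name}\".\"${var.schema_name}\""
    }
  }
}
```

The role still needs USAGE on the warehouse, database, and schema, granted in the same way as for the analyst role above, before SELECT is usable.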

Kafka Module

# modules/kafka/main.tf

# MSK Cluster
resource "aws_msk_cluster" "kafka" {
  cluster_name           = "data-platform-kafka"
  kafka_version          = "3.5.1"
  number_of_broker_nodes = 3
  
  broker_node_group_info {
    instance_type   = "kafka.m5.large"
    client_subnets  = var.subnet_ids
    security_groups = [aws_security_group.kafka.id]
    
    storage_info {
      ebs_storage_info {
        volume_size = 1000
      }
    }
  }
  
  encryption_info {
    encryption_in_transit {
      client_broker = "TLS"
      in_cluster    = true
    }
    encryption_at_rest_kms_key_arn = aws_kms_key.kafka.arn
  }
  
  configuration_info {
    arn      = aws_msk_configuration.kafka.arn
    revision = aws_msk_configuration.kafka.latest_revision
  }
  
  tags = {
    Name = "data-platform-kafka"
  }
}

# MSK Configuration
resource "aws_msk_configuration" "kafka" {
  name              = "data-platform-config"
  kafka_versions    = ["3.5.1"]
  
  server_properties = <<PROPERTIES
auto.create.topics.enable=true
delete.topic.enable=true
num.partitions=100
default.replication.factor=3
min.insync.replicas=2
log.retention.hours=168
PROPERTIES
}

# Security Group
resource "aws_security_group" "kafka" {
  name        = "kafka-security-group"
  description = "Security group for Kafka cluster"
  vpc_id      = var.vpc_id # assumed module input; without it the group is created in the default VPC

  # TLS listener only. The plaintext port 9092 stays closed because the
  # cluster sets client_broker = "TLS".
  ingress {
    from_port   = 9094
    to_port     = 9094
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }
}

# KMS Key for encryption
resource "aws_kms_key" "kafka" {
  description = "KMS key for Kafka encryption"
}
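
The provider block earlier declares `Mongey/kafka`, but the module only creates the cluster. Topics can be managed as code with that provider as well, which is generally safer than relying on automatic topic creation. This is a minimal sketch: the topic name, partition count, and the `bootstrap_servers` variable are assumptions, and in practice this often lives in a separate configuration that runs after the cluster exists, because a provider cannot easily be configured from resources created in the same apply.

```hcl
terraform {
  required_providers {
    kafka = {
      source  = "Mongey/kafka"
      version = "~> 0.6"
    }
  }
}

variable "bootstrap_servers" {
  description = "TLS bootstrap brokers of the MSK cluster, one host:port per element"
  type        = list(string)
}

provider "kafka" {
  bootstrap_servers = var.bootstrap_servers
  tls_enabled       = true # the cluster only accepts TLS client connections
}

resource "kafka_topic" "events" {
  name               = "platform.events" # hypothetical topic name
  partitions         = 12
  replication_factor = 3

  config = {
    "retention.ms"        = "604800000" # 7 days, matching log.retention.hours=168
    "min.insync.replicas" = "2"
    "cleanup.policy"      = "delete"
  }
}
```

The MSK resource exposes its TLS brokers as a comma-separated string (`bootstrap_brokers_tls`), so a caller would typically pass `split(",", ...)` of that value into `bootstrap_servers`.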

Spark Module

# modules/spark/main.tf

# EMR Cluster
resource "aws_emr_cluster" "spark" {
  name          = "data-platform-spark"
  release_label = "emr-6.15.0"
  applications  = ["Spark", "Hive", "JupyterEnterpriseGateway"]
  
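  service_role     = aws_iam_role.emr_service.arn
  autoscaling_role = var.emr_autoscaling_role_arn # assumed module input; needed when an autoscaling policy is set
  log_uri          = "s3://${aws_s3_bucket.spark_logs.bucket}/logs/"

  master_instance_group {
    instance_type  = "m5.xlarge"
    instance_count = 1
  }

  core_instance_group {
    instance_type  = "m5.2xlarge"
    instance_count = 3

    # autoscaling_policy is a JSON string, not a nested block
    autoscaling_policy = jsonencode({
      Constraints = {
        MinCapacity = 3
        MaxCapacity = 10
      }
      Rules = [
        {
          Name = "ScaleOutOnLowMemory"
          Action = {
            SimpleScalingPolicyConfiguration = {
              AdjustmentType    = "CHANGE_IN_CAPACITY"
              ScalingAdjustment = 1
              CoolDown          = 300
            }
          }
          Trigger = {
            CloudWatchAlarmDefinition = {
              ComparisonOperator = "LESS_THAN"
              MetricName         = "YARNMemoryAvailablePercentage"
              Namespace          = "AWS/ElasticMapReduce"
              Statistic          = "AVERAGE"
              Unit               = "PERCENT"
              Period             = 300
              EvaluationPeriods  = 1
              Threshold          = 15 # illustrative threshold; tune for the workload
            }
          }
        }
      ]
    })
  }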
  
  ec2_attributes {
    key_name                          = var.ssh_key_name
    subnet_id                         = var.subnet_id
    instance_profile                  = var.emr_instance_profile_arn # required argument; assumed module input
    emr_managed_master_security_group = aws_security_group.emr_master.id
    emr_managed_slave_security_group  = aws_security_group.emr_slave.id
  }
  
  tags = {
    Name = "data-platform-spark"
  }
}

# S3 Bucket for Spark logs
resource "aws_s3_bucket" "spark_logs" {
  bucket = "data-platform-spark-logs-${var.environment}"
}

# IAM Role for EMR
resource "aws_iam_role" "emr_service" {
  name = "emr-service-role"
  
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "emr.amazonaws.com"
        }
      }
    ]
  })
}
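
Besides the service role, the EC2 instances in an EMR cluster run under their own instance profile. A minimal sketch of that piece follows. The resource names are hypothetical, and the AWS managed policy attached here is broad, so a production setup would likely replace it with a policy scoped to the buckets and services the jobs actually use.

```hcl
variable "environment" {
  type = string # omit if the module already declares this variable
}

# Role assumed by the EC2 instances that make up the cluster.
resource "aws_iam_role" "emr_ec2" {
  name = "emr-ec2-role-${var.environment}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

# AWS managed policy for EMR instances; scope this down for production use.
resource "aws_iam_role_policy_attachment" "emr_ec2" {
  role       = aws_iam_role.emr_ec2.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}

# The instance profile is what the cluster's EC2 attributes reference.
resource "aws_iam_instance_profile" "emr_ec2" {
  name = "emr-ec2-profile-${var.environment}"
  role = aws_iam_role.emr_ec2.name
}
```

The service role shown above needs permissions attached in the same way; an assume-role policy alone grants nothing.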

Secrets Management

# modules/secrets/main.tf

# AWS Secrets Manager
resource "aws_secretsmanager_secret" "snowflake" {
  name = "data-platform/snowflake"
}

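# Note: values passed in through Terraform variables are also written to the
# state file, so the state backend must be encrypted and access-controlled.
resource "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = aws_secretsmanager_secret.snowflake.id
  secret_string = jsonencode({
    account  = var.snowflake_account
    user     = var.snowflake_user
    password = var.snowflake_password
  })
}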

# HashiCorp Vault (alternative)
# resource "vault_generic_secret" "snowflake" {
#   path = "secret/data-platform/snowflake"
#   
#   data_json = jsonencode({
#     account   = var.snowflake_account
#     user     = var.snowflake_user
#     password = var.snowflake_password
#   })
# }

CI/CD Pipeline

# .github/workflows/terraform.yml
name: Terraform

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

env:
  TF_VERSION: "1.6.0"
  # Environment directory to plan and apply. push and pull_request events carry
  # no workflow inputs, so the target is set here instead of being read from
  # github.event.inputs.
  TF_ENVIRONMENT: prod

jobs:
  terraform:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Terraform Init
        run: terraform init -input=false
        working-directory: environments/${{ env.TF_ENVIRONMENT }}

      # Runs on pull requests and pushes so reviewers can see the planned changes
      - name: Terraform Plan
        run: terraform plan -input=false -out=tfplan
        working-directory: environments/${{ env.TF_ENVIRONMENT }}

      # Applies the saved plan only after a merge to main
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -input=false -auto-approve tfplan
        working-directory: environments/${{ env.TF_ENVIRONMENT }}

Usage

# Run from the environment directory
cd environments/prod

# Initialize Terraform
terraform init

# Plan changes (terraform.tfvars in this directory is loaded automatically)
terraform plan -out=tfplan

# Apply the reviewed plan
terraform apply tfplan

# Destroy infrastructure
terraform destroy

💡

Production Tip: Always use remote state with state locking. Store state in encrypted S3 with versioning. Use separate state files for each environment. Never commit state files to Git.

Common Follow-Up Questions

Q1: How do you handle secrets in Terraform?

# Use variables for secrets
variable "snowflake_password" {
  type      = string
  sensitive = true
}

# Use AWS Secrets Manager
data "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = "data-platform/snowflake"
}

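# Use environment variables (shell): TF_VAR_<name> populates variable "<name>".
# Inject the value from the CI secret store rather than typing it into shell history.
export TF_VAR_snowflake_password="mysecretpassword"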
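
The data source above only fetches the secret; the JSON payload still has to be decoded before individual fields can be used. A small sketch of that step, assuming the secret holds a JSON object with `account`, `user`, and `password` keys as written by the secrets module earlier:

```hcl
data "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = "data-platform/snowflake"
}

locals {
  # secret_string is the raw JSON document; jsondecode turns it into an object.
  snowflake_secret = jsondecode(data.aws_secretsmanager_secret_version.snowflake.secret_string)
}

# Values derived from a sensitive attribute must themselves be marked sensitive.
output "snowflake_user" {
  value     = local.snowflake_secret["user"]
  sensitive = true
}
```

Values read this way are still recorded in the state file, so this keeps secrets out of Git but does not remove the need for an encrypted, access-controlled backend.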

Q2: How do you manage Terraform state?

# Remote state with S3 backend
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "data-platform/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

Q3: How do you handle Terraform modules?

# Use modules for reusable components
module "snowflake" {
  source = "../../modules/snowflake"
  
  environment = var.environment
  account     = var.snowflake_account
}
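
On the module side, the inputs passed in that call need matching variable declarations, and validation can catch a mistyped environment before anything is created. This is a minimal sketch of what `modules/snowflake/variables.tf` and `outputs.tf` might contain. The `warehouse_size` input, the naming prefix, and the output are assumptions for illustration.

```hcl
variable "environment" {
  description = "Deployment environment"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of dev, staging, or prod."
  }
}

variable "account" {
  description = "Snowflake account identifier"
  type        = string
}

variable "warehouse_size" {
  description = "Warehouse size (hypothetical input, lets dev run smaller than prod)"
  type        = string
  default     = "XSMALL"
}

locals {
  # Prefix object names so environments that share one account do not collide.
  name_prefix = upper(var.environment)
}

output "warehouse_name" {
  description = "Name other modules or tests can reference"
  value       = "${local.name_prefix}_ANALYTICS_WH"
}
```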

Q4: How do you implement Terraform testing?

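// Use Terratest for integration tests
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestTerraform(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../examples/simple",
    })

    // Tear down the infrastructure when the test finishes, even on failure
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Assert on the module output
    output := terraform.Output(t, terraformOptions, "snowflake_warehouse")
    assert.Equal(t, "ANALYTICS_WH", output)
}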

⚠️

Critical Consideration: Never commit secrets to Git. Use environment variables, AWS Secrets Manager, or HashiCorp Vault. Always encrypt state files and use state locking.

Company-Specific Tips

HashiCorp Interview Tips

Discuss Terraform best practices
Explain state management strategies
Mention modules and workspaces
Talk about Terraform Cloud features

Netflix Interview Tips

Focus on multi-cloud Terraform
Explain environment separation strategies
Mention secrets management at scale
Talk about CI/CD for infrastructure

Uber Interview Tips

Discuss Terraform for Kubernetes
Explain Helm charts for applications
Mention ArgoCD for GitOps
Talk about infrastructure testing

ℹ️

Final Takeaway: Infrastructure as Code is essential for managing modern data platforms. Use Terraform for provisioning, modules for reusability, and remote state for collaboration. Always implement proper secrets management, testing, and CI/CD.