Data Security: CMEK, VPC-SC & Audit Logs

Data Security on GCP

Master data security on GCP including CMEK, VPC Service Controls, audit logs, encryption, and security best practices.

Security Architecture

Diagram: GCP Security Architecture for Data Engineering (Defense-in-Depth)

Encryption
At rest: AES-256, auto-rotated by default
In transit: TLS 1.2+, internal Google network
CMEK: customer-managed encryption keys
CSEK: customer-supplied encryption keys

VPC Service Controls
Service perimeters: boundaries around GCP services
Access levels: IP, device, identity conditions
Dry-run mode: test before enforcement
Bridge perimeters: cross-perimeter access

Cloud Armor: DDoS protection, WAF rules (OWASP), IP allow/deny lists, Adaptive Protection
Cloud DLP API: data classification, sensitive data detection, de-identification, InfoType templates
Cloud KMS: key rings and keys, HSM / external KMS, key rotation, IAM for keys

Shared responsibility model
Google manages: physical security, network, hypervisor
You manage: data, IAM, configs, application code

Data engineering security checklist: enable CMEK for BigQuery, VPC-SC for data projects, DLP for sensitive data, audit logs enabled
Interview Tip: GCP follows a shared responsibility model: Google secures the infrastructure, you secure your data. Encryption at rest is on by default and cannot be turned off; add CMEK for sensitive data, implement VPC Service Controls to prevent data exfiltration, and use Cloud DLP to classify and protect PII.

Customer-Managed Encryption Keys (CMEK)

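import time

from google.cloud import kms_v1

client = kms_v1.KeyManagementServiceClient()

# Create key ring
key_ring = client.create_key_ring(
    request={
        "parent": "projects/my-project/locations/us-central1",
        "key_ring_id": "data-engineering-keyring",
        "key_ring": {},
    }
)

# Create an HSM-backed symmetric encryption key
key = client.create_crypto_key(
    request={
        "parent": key_ring.name,
        "crypto_key_id": "bigquery-encryption-key",
        # Key settings belong inside the "crypto_key" field of the request
        "crypto_key": {
            "purpose": kms_v1.CryptoKey.CryptoKeyPurpose.ENCRYPT_DECRYPT,
            "version_template": {
                "algorithm": kms_v1.CryptoKeyVersion.CryptoKeyVersionAlgorithm.GOOGLE_SYMMETRIC_ENCRYPTION,
                "protection_level": kms_v1.ProtectionLevel.HSM,
            },
        },
    }
)

# Enable automatic rotation every 90 days
rotation_seconds = 90 * 24 * 60 * 60  # 7,776,000 seconds
client.update_crypto_key(
    request={
        "crypto_key": {
            "name": key.name,
            "rotation_period": {"seconds": rotation_seconds},
            # rotation_period must be set together with next_rotation_time
            "next_rotation_time": {"seconds": int(time.time()) + rotation_seconds},
        },
        # The update mask lists the fields being changed
        "update_mask": {"paths": ["rotation_period", "next_rotation_time"]},
    }
)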
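
Creating the key is only half of CMEK: the consuming service must also be pointed at it. The sketch below builds the full Cloud KMS resource name for the key above and the encryptionConfiguration fragment that the BigQuery REST API accepts on a table. The helper names and the analytics/events dataset and table are hypothetical, and the code assumes the BigQuery service agent already holds roles/cloudkms.cryptoKeyEncrypterDecrypter on the key and that the key location matches the dataset location.

```python
def kms_key_name(project: str, location: str, key_ring: str, key: str) -> str:
    """Return the full Cloud KMS resource name for a crypto key."""
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def bigquery_table_body(project: str, dataset: str, table: str, key_name: str) -> dict:
    """Build a BigQuery REST table resource that requests CMEK protection."""
    return {
        "tableReference": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        # Tells BigQuery to encrypt the table with the customer-managed key
        "encryptionConfiguration": {"kmsKeyName": key_name},
    }


key_name = kms_key_name(
    "my-project", "us-central1", "data-engineering-keyring", "bigquery-encryption-key"
)
# "analytics" and "events" are placeholder dataset and table names
body = bigquery_table_body("my-project", "analytics", "events", key_name)
```

The resulting dictionary is what a tables.insert call would carry; the same key name can also be set as a dataset-level default so new tables inherit it.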

VPC Service Controls

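import time

from googleapiclient import discovery

# Discovery-based client for the Access Context Manager REST API
# (google-api-python-client, using Application Default Credentials)
service = discovery.build("accesscontextmanager", "v1")


def wait_for(operation):
    """Poll a long-running operation and return its response."""
    while not operation.get("done"):
        time.sleep(2)
        operation = service.operations().get(name=operation["name"]).execute()
    if "error" in operation:
        raise RuntimeError(operation["error"])
    return operation["response"]


# Create access policy (scopes take project numbers, not project IDs;
# 111111111111 is a placeholder project number)
policy = wait_for(
    service.accessPolicies().create(
        body={
            "parent": "organizations/123456789",
            "title": "Data Engineering Access Policy",
            "scopes": ["projects/111111111111"],
        }
    ).execute()
)

# Create service perimeter
perimeter = wait_for(
    service.accessPolicies().servicePerimeters().create(
        parent=policy["name"],
        body={
            # Perimeter IDs allow letters, digits, and underscores only
            "name": f"{policy['name']}/servicePerimeters/data_engineering_perimeter",
            "title": "Data Engineering Perimeter",
            "perimeterType": "PERIMETER_TYPE_REGULAR",
            "status": {
                "resources": ["projects/111111111111"],
                "restrictedServices": [
                    "bigquery.googleapis.com",
                    "storage.googleapis.com",
                    "dataflow.googleapis.com",
                ],
                "vpcAccessibleServices": {
                    "enableRestriction": True,
                    "allowedServices": ["bigquery.googleapis.com"],
                },
            },
        },
    ).execute()
)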
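
Enforcing a new perimeter straight away can break running pipelines, which is why VPC Service Controls offers a dry-run mode. As a rough sketch, using REST-style (camelCase) field names for the perimeter resource, the hypothetical helper below converts an enforced configuration into a dry-run one by moving status to spec and setting useExplicitDryRunSpec. The project number and perimeter contents are placeholders.

```python
import copy


def to_dry_run(perimeter: dict) -> dict:
    """Return a copy of the perimeter that is evaluated in dry-run mode only."""
    dry = copy.deepcopy(perimeter)
    # "status" is the enforced config; "spec" is the dry-run config
    dry["spec"] = dry.pop("status")
    dry["useExplicitDryRunSpec"] = True
    return dry


enforced = {
    "title": "Data Engineering Perimeter",
    "perimeterType": "PERIMETER_TYPE_REGULAR",
    "status": {
        "resources": ["projects/111111111111"],  # placeholder project number
        "restrictedServices": [
            "bigquery.googleapis.com",
            "storage.googleapis.com",
        ],
    },
}

dry_run = to_dry_run(enforced)
```

In dry-run mode, requests that would have been blocked are logged rather than denied, so the violations can be reviewed before the same configuration is enforced.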

Best Practice: Use CMEK for all sensitive data in BigQuery and GCS. Implement VPC Service Controls to prevent data exfiltration. Enable audit logging for data access. Use Workload Identity Federation instead of service account keys. Review security configurations quarterly.

Common Interview Questions

Q1: What is the difference between Google-managed and CMEK?

Answer: Google-managed keys are created, rotated, and managed by Google with no customer configuration. CMEK (Customer-Managed Encryption Keys) are managed by the customer in Cloud KMS, which gives control over key rotation, disabling and destruction, and IAM access to the key. CMEK is often chosen to meet regulatory or internal compliance requirements that mandate customer control of keys.

Q2: What are VPC Service Controls?

Answer: VPC Service Controls create security perimeters around GCP resources to prevent data exfiltration. They restrict which services can be accessed from within the perimeter and control egress to external services. Essential for protecting sensitive data in data lakes.
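
Perimeters are usually paired with access levels, which describe the conditions (IP range, device, identity) under which requests from outside the perimeter are admitted. The following is a minimal sketch of an IP-based access level in the shape the Access Context Manager REST API expects. The helper name, policy ID, level ID, and CIDR range are all hypothetical.

```python
import ipaddress


def ip_access_level(policy_name: str, level_id: str, title: str, cidrs: list) -> dict:
    """Build a basic access level that admits requests from the given CIDR ranges."""
    for cidr in cidrs:
        # Raises ValueError if the CIDR block is malformed
        ipaddress.ip_network(cidr)
    return {
        "name": f"{policy_name}/accessLevels/{level_id}",
        "title": title,
        "basic": {"conditions": [{"ipSubnetworks": list(cidrs)}]},
    }


level = ip_access_level(
    "accessPolicies/123456",  # placeholder policy name
    "corp_network",
    "Corporate network",
    ["203.0.113.0/24"],  # documentation-only example range
)
```

A perimeter would then reference the level by its full name in its accessLevels list.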

Q3: What types of audit logs should be enabled?

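Answer: 1) Admin Activity logs (always on), 2) Data Access logs (off by default for most services such as GCS; BigQuery writes them by default), 3) System Event logs (always on), 4) Policy Denied logs (generated by default). Only Data Access logs need to be enabled explicitly. They are critical for compliance but can be high-volume and incur logging costs, so enable them at least for services that hold sensitive datasets.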
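
Data Access logs are switched on through the auditConfigs section of a project's IAM policy. As an illustration, the hypothetical helper below adds DATA_READ and DATA_WRITE logging for one service to a policy dictionary of the shape returned by getIamPolicy. It only edits the dictionary; fetching and writing the policy back is left out.

```python
import copy


def enable_data_access_logs(policy: dict, service: str,
                            log_types=("DATA_READ", "DATA_WRITE")) -> dict:
    """Return a copy of the IAM policy with Data Access logging enabled for a service."""
    updated = copy.deepcopy(policy)
    configs = updated.setdefault("auditConfigs", [])
    entry = next((c for c in configs if c.get("service") == service), None)
    if entry is None:
        entry = {"service": service, "auditLogConfigs": []}
        configs.append(entry)
    existing = {c["logType"] for c in entry["auditLogConfigs"]}
    for log_type in log_types:
        if log_type not in existing:  # keep the operation idempotent
            entry["auditLogConfigs"].append({"logType": log_type})
    return updated


policy = {"bindings": []}
updated = enable_data_access_logs(policy, "storage.googleapis.com")
```

Running the helper twice leaves the policy unchanged the second time, which makes it safe to apply from automation.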

Q4: How do you secure a data pipeline?

Answer: 1) Use service accounts with minimal permissions, 2) Enable CMEK for encryption, 3) Implement VPC Service Controls, 4) Use Private Google Access, 5) Enable audit logging, 6) Implement data masking for non-production, 7) Monitor for anomalies.
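
Step 6, data masking for non-production, can be sketched with the standard library alone. The example below replaces sensitive fields with a keyed HMAC-SHA256 digest, so the same input always maps to the same token (joins still work) but the original value cannot be read back. This is one possible masking approach, not the Cloud DLP API; the field names and the secret are placeholders, and a real secret would come from a secret manager.

```python
import hashlib
import hmac


def mask_value(value: str, secret: bytes) -> str:
    """Return a deterministic 16-character token for a sensitive value."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_record(record: dict, sensitive_fields: set, secret: bytes) -> dict:
    """Mask only the listed fields; leave other fields and None values untouched."""
    return {
        key: mask_value(str(value), secret)
        if key in sensitive_fields and value is not None
        else value
        for key, value in record.items()
    }


secret = b"replace-with-a-managed-secret"  # placeholder only
row = {"user_id": 42, "email": "alice@example.com", "country": "DE"}
masked = mask_record(row, {"email"}, secret)
```

Because the token is keyed, rotating the secret produces a fresh, unlinkable set of tokens for the next non-production refresh.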

Q5: What is Workload Identity Federation?

Answer: Workload Identity Federation lets workloads running outside GCP (for example on AWS, Azure, or GitHub Actions) access GCP resources by exchanging credentials from their own identity provider, such as OIDC tokens, for short-lived Google credentials instead of using service account keys. It eliminates long-lived credentials and is recommended for CI/CD and multi-cloud scenarios.
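
On the client side, federation is driven by a credential configuration file of type external_account rather than a key file. The sketch below generates such a configuration for an OIDC token read from a local file. The function name, project number, pool, provider, service account, and token path are all hypothetical, and the layout follows the general shape that gcloud produces for file-sourced OIDC credentials.

```python
import json


def external_account_config(project_number: str, pool_id: str, provider_id: str,
                            service_account_email: str, token_file: str) -> dict:
    """Build a credential configuration for OIDC-based Workload Identity Federation."""
    audience = (
        f"//iam.googleapis.com/projects/{project_number}/locations/global"
        f"/workloadIdentityPools/{pool_id}/providers/{provider_id}"
    )
    return {
        "type": "external_account",
        "audience": audience,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        # Security Token Service endpoint that exchanges the external token
        "token_url": "https://sts.googleapis.com/v1/token",
        "service_account_impersonation_url": (
            "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
            f"{service_account_email}:generateAccessToken"
        ),
        # Where the client library reads the external OIDC token from
        "credential_source": {"file": token_file},
    }


config = external_account_config(
    "111111111111",  # placeholder project number
    "ci-pool",
    "github-provider",
    "pipeline-sa@my-project.iam.gserviceaccount.com",
    "/var/run/secrets/oidc/token",
)
config_json = json.dumps(config, indent=2)
```

The file contains no secret material: it only tells the client library where to find the external token and how to exchange it, so it can be committed alongside pipeline configuration.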
