Key Vault: Secrets, Keys & Certificates for Data Pipelines
Secure credential management with Key Vault for Azure data engineering pipelines
Key Vault Architecture
VAULT TYPES
  Standard:      Software-protected keys
  Premium:       HSM-protected keys (FIPS 140-2 Level 2)
  Managed HSM:   Dedicated HSM pool

OBJECT TYPES
  Secrets:       Connection strings, passwords, tokens
  Keys:          Encryption keys (RSA, EC)
  Certificates:  TLS/SSL certificates

ACCESS CONTROL
  Vault Access Policy:  Legacy (per-vault)
  RBAC:                 Recommended (per-resource)

  Roles:
    • Key Vault Administrator:  Full management
    • Key Vault Secrets User:   Read secrets
    • Key Vault Crypto User:    Use keys for encryption

INTEGRATION WITH DATA ENGINEERING
  ADF:         Linked Services reference Key Vault secrets
  Synapse:     Managed Identity retrieves secrets
  Databricks:  Secret Scope integrates with Key Vault
  Functions:   Key Vault references in app settings
Python SDK for Key Vault
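from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.keyvault.keys import KeyClient
from azure.keyvault.secrets import SecretClient

VAULT_URL = "https://kv-dataengineering.vault.azure.net/"

# DefaultAzureCredential tries several sources in turn
# (environment variables, managed identity, Azure CLI login, ...)
credential = DefaultAzureCredential()

# Secrets client
secret_client = SecretClient(vault_url=VAULT_URL, credential=credential)

# Store secret
secret_client.set_secret("adls-connection-string", "DefaultEndpointsProtocol=https;...")

# Retrieve secret (log metadata only; never print secret.value)
secret = secret_client.get_secret("adls-connection-string")
print(f"Retrieved {secret.name}, version {secret.properties.version}")

# Rotate secret: setting an existing name creates a new version
secret_client.set_secret(
    "adls-connection-string",
    "new-connection-string",
    expires_on=datetime(2025, 12, 31, tzinfo=timezone.utc),
)

# Keys client
key_client = KeyClient(vault_url=VAULT_URL, credential=credential)

# Create encryption key (RSA-HSM needs a Premium vault or Managed HSM).
# wrapKey/unwrapKey are included because customer-managed key scenarios
# for storage wrap the data encryption key rather than encrypt data directly.
key = key_client.create_key(
    name="cmk-adls",
    key_type="RSA-HSM",
    size=2048,
    key_operations=["encrypt", "decrypt", "wrapKey", "unwrapKey"],
)

# Rotate key: creating a key under an existing name adds a new version
key_client.create_key(
    name="cmk-adls",
    key_type="RSA-HSM",
    size=2048,
)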
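Pipelines that read the same secret many times per run can hit Key Vault throttling limits. The following is a minimal sketch of a short-lived in-memory cache around secret reads. `CachedSecretReader`, its `ttl_seconds` default, and the injectable `clock` are hypothetical names introduced for illustration; the only thing assumed about the client is that it exposes `get_secret(name)` returning an object with a `.value` attribute, as `SecretClient` does.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedSecretReader:
    """Caches secret values in memory for a short time to limit Key Vault calls."""

    def __init__(
        self,
        client: Any,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # `client` is anything with get_secret(name) -> object with `.value`
        # (hypothetical wrapper; not part of the Azure SDK).
        self._client = client
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        hit = self._cache.get(name)
        # Serve from cache while the entry is younger than the TTL.
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = self._client.get_secret(name).value
        self._cache[name] = (now, value)
        return value
```

With the `secret_client` created earlier, usage would look like `reader = CachedSecretReader(secret_client)` followed by `reader.get("adls-connection-string")`. A short TTL keeps the window small during which a rotated secret is still served from the cache.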
ADF Linked Service with Key Vault
{
  "name": "ls_adls_keyvault",
  "properties": {
    "type": "AzureBlobFS",
    "typeProperties": {
      "url": "https://stdatalake001.dfs.core.windows.net",
      "accountKey": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "akv_dataengineering",
          "type": "LinkedServiceReference"
        },
        "secretName": "adls-storage-key"
      }
    }
  }
}
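When many linked services share the same Key Vault store, it can help to generate the secret reference instead of hand-writing it. The sketch below rebuilds the definition above from Python. The helper names `akv_secret_reference` and `adls_linked_service` are hypothetical, and the optional `secretVersion` property reflects my understanding of the ADF linked-service schema, so verify it against the current documentation before relying on it.

```python
import json
from typing import Any, Dict, Optional


def akv_secret_reference(
    store_name: str, secret_name: str, secret_version: Optional[str] = None
) -> Dict[str, Any]:
    """Build the AzureKeyVaultSecret object used inside a linked service."""
    ref: Dict[str, Any] = {
        "type": "AzureKeyVaultSecret",
        "store": {"referenceName": store_name, "type": "LinkedServiceReference"},
        "secretName": secret_name,
    }
    # Assumption: omitting secretVersion makes the service use the latest
    # version, so rotated secrets are picked up without editing the JSON.
    if secret_version is not None:
        ref["secretVersion"] = secret_version
    return ref


def adls_linked_service(name: str, url: str, key_ref: Dict[str, Any]) -> Dict[str, Any]:
    """Build an AzureBlobFS linked service whose account key comes from Key Vault."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBlobFS",
            "typeProperties": {"url": url, "accountKey": key_ref},
        },
    }


definition = adls_linked_service(
    "ls_adls_keyvault",
    "https://stdatalake001.dfs.core.windows.net",
    akv_secret_reference("akv_dataengineering", "adls-storage-key"),
)
print(json.dumps(definition, indent=2))
```

Leaving the version unpinned is usually the simpler choice for rotation; pinning a version trades that convenience for a fully reproducible deployment.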
Key Vault Security Best Practices
# Enable soft delete
# (on by default for new vaults and cannot be disabled once set;
#  recent Azure CLI versions may report this flag as deprecated)
az keyvault update --name kv-dataengineering --enable-soft-delete true

# Enable purge protection (requires soft delete; cannot be disabled later)
az keyvault update --name kv-dataengineering --enable-purge-protection true

# Set network rules: deny by default, allow trusted Azure services
az keyvault update --name kv-dataengineering \
  --default-action Deny \
  --bypass AzureServices

# Enable logging: send audit events to a Log Analytics workspace
az monitor diagnostic-settings create \
  --resource "/subscriptions/xxx/resourceGroups/rg/providers/Microsoft.KeyVault/vaults/kv-dataengineering" \
  --name "KeyVaultDiagnostics" \
  --workspace "/subscriptions/xxx/resourceGroups/rg/providers/Microsoft.OperationalInsights/workspaces/law-dataengineering" \
  --logs '[{"category":"AuditEvent","enabled":true}]'
⚠️ Security Critical: Always enable purge protection and soft delete for Key Vault. Restrict network access with firewall rules. Use Managed Identities instead of connection strings.
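The settings above can also be checked automatically, for example in a CI step. The sketch below audits a vault description held as a Python dict. The field names (`properties.enableSoftDelete`, `properties.enablePurgeProtection`, `properties.networkAcls.defaultAction`) are an assumption about the JSON shape that `az keyvault show` returns, and `audit_vault` is a hypothetical helper, so confirm the shape against real output first.

```python
from typing import Any, Dict, List


def audit_vault(vault: Dict[str, Any]) -> List[str]:
    """Return a list of findings for a vault description dict (empty if clean)."""
    props = vault.get("properties") or {}
    findings: List[str] = []

    # Soft delete defaults to on, so only an explicit False is flagged.
    if props.get("enableSoftDelete") is False:
        findings.append("soft delete is disabled")

    # Purge protection is opt-in: a missing or null value means it is off.
    if not props.get("enablePurgeProtection"):
        findings.append("purge protection is not enabled")

    # Without network ACLs, or with any default other than Deny, the vault
    # accepts traffic from all networks.
    acls = props.get("networkAcls") or {}
    if acls.get("defaultAction") != "Deny":
        findings.append("network default action is not Deny")

    return findings
```

One possible use is to load the output of `az keyvault show --name kv-dataengineering` with `json.load` and fail the pipeline when `audit_vault` returns any findings.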
Interview Questions
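Q1: How do you implement secret rotation in Key Vault? A: Run the rotation logic in an Azure Function, triggered on a timer or by Key Vault near-expiry events delivered through Event Grid. The function generates the new credential at the source system and stores it in Key Vault as a new version of the same secret. ADF linked services that reference the secret by name without pinning a version resolve the latest version, so they only need updating when a version is pinned. Use Key Vault RBAC so that only the rotation identity can write secrets.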
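The core of such a rotation function can be sketched as follows. `rotate_if_expiring`, its default lifetimes, and the `new_value` callback are hypothetical; the client is assumed to expose `get_secret` and `set_secret` in the way `SecretClient` does, with the expiry available as `secret.properties.expires_on`. Generating the new credential at the source system is left to the callback.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Optional


def rotate_if_expiring(
    client: Any,
    name: str,
    new_value: Callable[[], str],
    lifetime: timedelta = timedelta(days=90),
    renew_before: timedelta = timedelta(days=14),
    now: Optional[datetime] = None,
) -> bool:
    """Write a new secret version when the current one is close to expiry.

    Returns True if a new version was written, False otherwise.
    """
    now = now or datetime.now(timezone.utc)
    current = client.get_secret(name)
    expires_on = current.properties.expires_on

    # A secret without an expiry is treated as due, so it gains one on first run.
    if expires_on is not None and expires_on - now > renew_before:
        return False

    # Setting an existing name creates a new version with a fresh expiry.
    client.set_secret(name, new_value(), expires_on=now + lifetime)
    return True
```

Called from a timer-triggered function, this makes each run idempotent: secrets with plenty of lifetime left are skipped, and only those inside the renewal window receive a new version.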
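Q2: What is the difference between Key Vault access policy and RBAC? A: Access policies are the legacy model: permissions are granted per vault and apply to every secret, key, or certificate of that type in the vault. Azure RBAC is the recommended model: role assignments can be scoped to a management group, subscription, resource group, vault, or individual object, and are inherited down that hierarchy. RBAC therefore gives finer-grained control and manages vault permissions with the same role assignments used for other Azure resources. Both models authenticate callers through Azure AD.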
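The difference in granularity shows up in the scope string of an RBAC role assignment. Below is a small sketch that builds the scope for a whole vault or for a single object inside it. `key_vault_scope` is a hypothetical helper; the vault-level format matches the resource ID used in the diagnostic-settings command earlier, and the per-object suffix (`/secrets/<name>`, `/keys/<name>`, `/certificates/<name>`) is the form I understand Azure RBAC to accept for Key Vault.

```python
from typing import Optional

_OBJECT_TYPES = ("secrets", "keys", "certificates")


def key_vault_scope(
    subscription_id: str,
    resource_group: str,
    vault_name: str,
    object_type: Optional[str] = None,
    object_name: Optional[str] = None,
) -> str:
    """Build an RBAC scope for a vault, or for one object inside the vault."""
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.KeyVault/vaults/{vault_name}"
    )
    if object_type is None:
        return scope  # vault-wide assignment
    if object_type not in _OBJECT_TYPES:
        raise ValueError(f"object_type must be one of {_OBJECT_TYPES}")
    if not object_name:
        raise ValueError("object_name is required when object_type is given")
    # Narrow the assignment to a single secret, key, or certificate.
    return f"{scope}/{object_type}/{object_name}"
```

For example, granting the Key Vault Secrets User role at the scope returned for `("secrets", "adls-storage-key")` would let a pipeline identity read that one secret and nothing else in the vault.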
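Q3: How do you handle Key Vault in multi-region deployments? A: Key Vault already replicates vault contents within its region and to the paired region on both Standard and Premium tiers, but that replica is a failover copy, not a second vault that regional resources can address. For active multi-region pipelines, deploy a separate Key Vault per region, replicate secrets across the vaults using the Azure CLI or SDK (keys can be moved with backup and restore), and have each regional resource reference the Key Vault in its own region.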
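Replicating secrets from a primary vault to regional vaults with the SDK can be sketched as below. `replicate_secrets` is a hypothetical helper; the clients are assumed to behave like `SecretClient`, using `list_properties_of_secrets()`, `get_secret(name)`, and `set_secret(name, value)`. Only the latest value is copied in this sketch, so tags, content type, and expiry dates would need extra handling.

```python
from typing import Any, Dict, List


def replicate_secrets(source: Any, targets: Dict[str, Any]) -> Dict[str, List[str]]:
    """Copy the latest value of each enabled secret from source to every target.

    `targets` maps a region label to a client for that region's vault.
    Returns the secret names written per region.
    """
    copied: Dict[str, List[str]] = {region: [] for region in targets}
    for props in source.list_properties_of_secrets():
        if props.enabled is False:
            continue  # the value of a disabled secret cannot be read
        value = source.get_secret(props.name).value
        for region, client in targets.items():
            # Writing an existing name adds a new version in the target vault.
            client.set_secret(props.name, value)
            copied[region].append(props.name)
    return copied
```

Because every run writes a new version in each target, a production job would probably compare values or versions first and skip unchanged secrets.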