Security Architecture Overview
AWS Security Architecture

  Identity & Access    Data Protection      Network Security
  -----------------    ---------------      ----------------
  IAM                  Encryption           VPC
  Cognito              KMS                  Security Groups
  SSO                  Secrets Manager      NACLs

  Monitoring           Governance           Compliance
  ----------           ----------           ----------
  CloudTrail           Config               Artifact
  GuardDuty            Lake Formation       Audit Manager
  Security Hub         Data Catalog         Macie
Q1: How do you implement the principle of least privilege in AWS?
Answer:
Least Privilege Implementation:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::data-lake",
        "arn:aws:s3:::data-lake/sales/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Department": "sales"
        }
      }
    }
  ]
}
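A policy like the one above can be checked mechanically for the most common least-privilege violations. The following is a minimal sketch, not an AWS tool: `lint_policy` and the two sample policies are hypothetical names introduced here, and the three checks (wildcard actions, wildcard resources, missing `Condition`) are an illustrative subset of what IAM Access Analyzer reports.

```python
def lint_policy(policy):
    """Return a list of warnings for overly broad Allow statements."""
    warnings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        for action in actions:
            if action == "*" or action.endswith(":*"):
                warnings.append(f"statement {index}: wildcard action {action!r}")
        for resource in resources:
            if resource == "*":
                warnings.append(f"statement {index}: wildcard resource")
        if "Condition" not in stmt:
            warnings.append(f"statement {index}: no Condition block")
    return warnings


# The scoped policy from the answer above
sales_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::data-lake", "arn:aws:s3:::data-lake/sales/*"],
        "Condition": {"StringEquals": {"aws:PrincipalTag/Department": "sales"}},
    }],
}

# A deliberately broad policy for contrast (hypothetical)
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}

print(lint_policy(sales_policy))
print(lint_policy(broad_policy))
```

A missing `Condition` is not always a problem, so in practice that check would likely be a hint rather than a failure.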
IAM Best Practices:
Least Privilege Implementation

  1. Start with Minimum Permissions
     • Begin with no access
     • Add permissions as needed
     • Use IAM Access Analyzer to identify unused permissions

  2. Use Conditions for Fine-Grained Control
     • Resource-based conditions
     • Request-based conditions
     • Tag-based conditions

  3. Regular Access Reviews
     • IAM Access Analyzer reports
     • CloudTrail audit logs
     • Quarterly access reviews
Q2: How do you implement encryption at rest and in transit?
Answer:
Encryption Strategies:
Encryption Architecture

  At Rest
    S3:       SSE-S3, SSE-KMS, SSE-C
    EBS:      AES-256 via KMS (automatic only when account-level default encryption is enabled)
    RDS:      KMS encryption
    Redshift: KMS or hardware security modules

  In Transit
    TLS 1.2/1.3 for all API calls
    SSL/TLS for database connections
    VPC endpoints for private communication
    VPN for hybrid connectivity

  Column- and Field-Level Protection
    Redshift: ENCODE sets column compression, not encryption; encrypt
              sensitive values before load, or combine cluster encryption
              with column-level GRANTs
    DynamoDB: Client-side encryption
    Kinesis:  Server-side encryption at rest (per stream, not per column)
S3 Encryption Configuration:
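import json

import boto3

s3 = boto3.client('s3')

# Default encryption: new objects are encrypted with the given KMS key
s3.put_bucket_encryption(
    Bucket='data-lake-bucket',
    ServerSideEncryptionConfiguration={
        'Rules': [
            {
                'ApplyServerSideEncryptionByDefault': {
                    'SSEAlgorithm': 'aws:kms',
                    # Placeholder key ARN
                    'KMSMasterKeyID': 'arn:aws:kms:us-east-1:123456789:key/key-id'
                },
                # S3 Bucket Keys reduce the number of requests made to KMS
                'BucketKeyEnabled': True
            }
        ]
    }
)

# Require encryption in bucket policy: deny uploads that do not request SSE-KMS.
# StringNotEquals also matches when the header is absent, so clients must send it.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::data-lake-bucket/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            }
        }
    ]
}
# Attach the policy to the same bucket
s3.put_bucket_policy(Bucket='data-lake-bucket', Policy=json.dumps(bucket_policy))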
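Redshift Column Encoding (compression, not encryption):
-- ENCODE selects a column compression encoding; it does not encrypt data.
-- Encryption at rest is configured at the cluster level (KMS or HSM).
CREATE TABLE customers (
    customer_id INT,
    name VARCHAR(100),
    email VARCHAR(200),
    ssn VARCHAR(11) ENCODE LZO,
    credit_card VARCHAR(19) ENCODE ZSTD  -- AZ64 does not support VARCHAR
)
DISTSTYLE KEY
DISTKEY(customer_id);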
Q3: How do you implement data masking and tokenization?
Answer:
Data Masking Patterns:
# Dynamic data masking
class DataMasker:
    def __init__(self):
        # Map each field name to its masking function
        self.masking_rules = {
            'email': self.mask_email,
            'ssn': self.mask_ssn,
            'credit_card': self.mask_credit_card,
            'phone': self.mask_phone
        }

    def mask_email(self, value):
        if not value or '@' not in value:
            return value
        # Split on the last '@' so the domain is always the final part
        local, domain = value.rsplit('@', 1)
        if len(local) <= 2:
            # Too short to keep first and last characters without leaking the value
            masked_local = '*' * len(local)
        else:
            masked_local = local[0] + '*' * (len(local) - 2) + local[-1]
        return f"{masked_local}@{domain}"

    def mask_ssn(self, value):
        if not value:
            return value
        # Keep only the last four digits
        return f"***-**-{value[-4:]}"

    def mask_credit_card(self, value):
        if not value:
            return value
        return f"****-****-****-{value[-4:]}"

    def mask_phone(self, value):
        if not value:
            return value
        return f"***-***-{value[-4:]}"
Tokenization Service:
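import hashlib
import secrets


class TokenizationService:
    def __init__(self):
        # In-memory stores for illustration; a real vault needs durable,
        # access-controlled storage
        self.token_store = {}
        self.reverse_store = {}

    def tokenize(self, sensitive_value):
        # Reuse the existing token so one value always maps to one token
        if sensitive_value in self.reverse_store:
            return self.reverse_store[sensitive_value]
        # Generate token
        token = secrets.token_hex(16)
        # Store mapping in both directions
        self.token_store[token] = sensitive_value
        self.reverse_store[sensitive_value] = token
        return token

    def detokenize(self, token):
        return self.token_store.get(token)

    def hash_tokenize(self, sensitive_value):
        # One-way tokenization using SHA-256.
        # A fresh salt per call means equal inputs produce different outputs.
        salt = secrets.token_hex(16)
        hash_value = hashlib.sha256(
            f"{salt}:{sensitive_value}".encode()
        ).hexdigest()
        return f"{salt}:{hash_value}"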
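Because `hash_tokenize` draws a new random salt on every call, the same value yields a different token each time, so tokenized columns cannot be joined or deduplicated. One common alternative, sketched here as an assumption rather than as part of the original service, is a keyed HMAC: equal inputs give equal tokens under one key, and the token cannot be recomputed without that key. `deterministic_token` and `demo_key` are hypothetical names.

```python
import hashlib
import hmac


def deterministic_token(sensitive_value, key):
    """Return a keyed, one-way token: equal inputs give equal tokens under one key."""
    digest = hmac.new(key, sensitive_value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for illustration; a real key would come from a secrets store.
demo_key = b"demo-key-not-for-production"

token_a = deterministic_token("123-45-6789", demo_key)
token_b = deterministic_token("123-45-6789", demo_key)
token_c = deterministic_token("987-65-4321", demo_key)

# Same input, same token; different input, different token
print(token_a == token_b, token_a == token_c)
```

The trade-off is that determinism reveals which rows share a value, so this fits join keys better than free-standing secrets.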
AWS Glue Data Masking:
from pyspark.sql.functions import udf, col, lit
from pyspark.sql.types import StringType


@udf(returnType=StringType())
def mask_pii(value, mask_type):
    if not value:
        return value
    if mask_type == 'email' and '@' in value:
        # Keep the first character of the local part and the full domain
        local, domain = value.rsplit('@', 1)
        return f"{local[0]}***@{domain}" if local else f"***@{domain}"
    elif mask_type == 'ssn':
        return f"***-**-{value[-4:]}"
    elif mask_type == 'name':
        return value[0] + '*' * (len(value) - 1)
    return value


# Apply masking (df is an existing DataFrame with an "email" column)
df = df.withColumn(
    "masked_email",
    mask_pii(col("email"), lit("email"))
)
Q4: How do you implement compliance frameworks (GDPR, HIPAA, PCI)?
Answer:
Compliance Architecture:
Compliance Framework Implementation

  GDPR Requirements        AWS Services
  -----------------        ------------
  Right to Erasure     --> S3 Delete/Lifecycle
  Data Portability     --> Athena/Glue Export
  Consent Mgmt         --> DynamoDB
  Implementation: Automated deletion

  HIPAA Requirements       AWS Services
  ------------------       ------------
  Access Controls      --> IAM/Lake Formation
  Audit Logging        --> CloudTrail
  Encryption           --> KMS
  Implementation: BAA (Business Associate Addendum) with AWS, encryption

  PCI DSS Requirements     AWS Services
  --------------------     ------------
  Network Security     --> VPC/Security Groups
  Access Control       --> IAM
  Encryption           --> KMS
  Implementation: Quarterly vulnerability scans, annual pen tests
GDPR Right to Erasure Implementation:
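from datetime import datetime, timezone

import boto3


class GDPRErasure:
    def __init__(self):
        # Map each data store to the function that erases a user's data in it
        self.data_stores = {
            's3': self.erase_s3_data,
            'dynamodb': self.erase_dynamodb_data,
            'redshift': self.erase_redshift_data
        }

    def erase_user_data(self, user_id):
        results = {}
        for store, erase_func in self.data_stores.items():
            try:
                erase_func(user_id)
                results[store] = 'success'
            except Exception as e:
                results[store] = f'error: {str(e)}'
        # Log erasure for audit
        self.log_erasure(user_id, results)
        return results

    def erase_s3_data(self, user_id):
        s3 = boto3.client('s3')
        # Find and delete user data.
        # On a versioned bucket this only adds delete markers; older versions
        # must be removed separately.
        paginator = s3.get_paginator('list_objects_v2')
        pages = paginator.paginate(Bucket='data-lake')
        for page in pages:
            for obj in page.get('Contents', []):
                if user_id in obj['Key']:
                    s3.delete_object(
                        Bucket='data-lake',
                        Key=obj['Key']
                    )

    def erase_dynamodb_data(self, user_id):
        # Store-specific deletion is not shown here
        raise NotImplementedError('DynamoDB erasure not implemented')

    def erase_redshift_data(self, user_id):
        # Store-specific deletion is not shown here
        raise NotImplementedError('Redshift erasure not implemented')

    def log_erasure(self, user_id, results):
        # Log for compliance audit
        dynamodb = boto3.resource('dynamodb')
        table = dynamodb.Table('gdpr_audit_log')
        table.put_item(
            Item={
                'user_id': user_id,
                'action': 'DATA_ERASURE',
                'timestamp': datetime.now(timezone.utc).isoformat(),
                'results': results
            }
        )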
Q5: How do you implement audit logging and monitoring?
Answer:
Audit Architecture:
Audit Logging Architecture

  CloudTrail (API Logs)
    --> CloudWatch (Log Groups, Metrics)
    --> Security Hub (Aggregated Findings)

  CloudTrail    --> S3 Logs (Central)
  CloudWatch    --> Alarms
  Security Hub  --> Compliance Dashboard
CloudTrail Configuration:
import boto3

# Create CloudTrail
cloudtrail = boto3.client('cloudtrail')
cloudtrail.create_trail(
    Name='data-audit-trail',
    S3BucketName='audit-logs-bucket',
    IncludeGlobalServiceEvents=True,
    IsMultiRegionTrail=True,
    EnableLogFileValidation=True
)
# A new trail does not record events until logging is started
cloudtrail.start_logging(Name='data-audit-trail')
# S3 and Lambda data event logging
cloudtrail.put_event_selectors(
    TrailName='data-audit-trail',
    EventSelectors=[
        {
            'ReadWriteType': 'All',
            'IncludeManagementEvents': True,
            'DataResources': [
                {
                    # Trailing slash: all objects in the data-lake bucket
                    'Type': 'AWS::S3::Object',
                    'Values': ['arn:aws:s3:::data-lake/']
                },
                {
                    'Type': 'AWS::Lambda::Function',
                    'Values': ['arn:aws:lambda:us-east-1:123456789:function:']
                }
            ]
        }
    ]
)
Custom Audit Events:
import json
from datetime import datetime, timezone

import boto3


class AuditLogger:
    def __init__(self):
        self.cloudwatch = boto3.client('cloudwatch')

    def log_data_access(self, user_id, resource, action, details):
        # CloudWatch metric for auditing.
        # Each distinct dimension combination is a separate metric, so
        # per-user dimensions can become costly with many users.
        self.cloudwatch.put_metric_data(
            Namespace='DataAudit',
            MetricData=[
                {
                    'MetricName': 'DataAccess',
                    'Dimensions': [
                        {'Name': 'UserId', 'Value': user_id},
                        {'Name': 'Resource', 'Value': resource},
                        {'Name': 'Action', 'Value': action}
                    ],
                    'Value': 1,
                    'Unit': 'Count'
                }
            ]
        )
        # Structured log for CloudWatch Logs (stdout is captured when
        # running in Lambda)
        log_entry = {
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'user_id': user_id,
            'resource': resource,
            'action': action,
            'details': details
        }
        print(json.dumps(log_entry))
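The structured-log half of the logger above can be exercised without AWS. The sketch below is an assumption-laden variant, not the original class: it builds the same record as a plain function, takes an optional `now` so the timestamp is testable, and emits through the standard `logging` module instead of `print`. `build_audit_entry` and the `data_audit` logger name are hypothetical.

```python
import json
import logging
from datetime import datetime, timezone


def build_audit_entry(user_id, resource, action, details, now=None):
    """Build one JSON-serializable audit record with a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "user_id": user_id,
        "resource": resource,
        "action": action,
        "details": details,
    }


logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("data_audit")  # hypothetical logger name

entry = build_audit_entry(
    "user-42",
    "s3://data-lake/sales/2024.parquet",
    "READ",
    {"rows": 100},
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
# One JSON object per line keeps the log easy to query downstream
logger.info(json.dumps(entry, sort_keys=True))
```

Timezone-aware UTC timestamps avoid ambiguity when logs from several regions are merged.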
Q6: How do you implement network security for data pipelines?
Answer:
Network Security Architecture:

  Public Subnet                Private Subnet
  -------------                --------------
  Load Balancer (Public)       EMR Cluster (Private)
  NAT Gateway                  Redshift Cluster (Private)

  VPC Endpoints
    S3 Gateway Endpoint
    DynamoDB Gateway Endpoint
    Interface Endpoints (Glue, KMS, etc.)
VPC Configuration:
import json

import boto3

# Create VPC endpoint for S3 (gateway type, attached to a route table)
ec2 = boto3.client('ec2')
ec2.create_vpc_endpoint(
    VpcEndpointType='Gateway',
    VpcId='vpc-12345678',
    ServiceName='com.amazonaws.us-east-1.s3',
    RouteTableIds=['rtb-12345678'],
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::data-lake",
                    "arn:aws:s3:::data-lake/*"
                ]
            }
        ]
    })
)
# Security group for data pipeline
pipeline_sg = ec2.create_security_group(
    GroupName='data-pipeline-sg',
    Description='Security group for data pipeline',
    VpcId='vpc-12345678'
)
# Restrict access: allow HTTPS only from the trusted security group
ec2.authorize_security_group_ingress(
    GroupId=pipeline_sg['GroupId'],
    IpPermissions=[
        {
            'IpProtocol': 'tcp',
            'FromPort': 443,
            'ToPort': 443,
            'UserIdGroupPairs': [
                {'GroupId': 'sg-87654321'}  # Only from trusted SG
            ]
        }
    ]
)
Q7: How do you implement secrets management?
Answer:
Secrets Management Architecture:

  AWS Secrets Manager          AWS Systems Manager
  -------------------          -------------------
  • Database creds             • Parameter Store
  • API keys                   • String/Secure
  • Certificates               • Hierarchical
  • Automatic rotation         • Encryption

  Secrets Manager, Systems Manager --> Applications (Lambda, ECS, etc)
Secrets Manager Implementation:
import json

import boto3

secrets_client = boto3.client('secretsmanager')


# Store secret
def store_secret(secret_name, secret_value):
    secrets_client.create_secret(
        Name=secret_name,
        Description='Database credentials',
        SecretString=json.dumps(secret_value),
        Tags=[
            {'Key': 'Environment', 'Value': 'production'},
            {'Key': 'Service', 'Value': 'data-pipeline'}
        ]
    )


# Retrieve secret
def get_secret(secret_name):
    response = secrets_client.get_secret_value(SecretId=secret_name)
    return json.loads(response['SecretString'])


# Rotate secret
def rotate_secret(secret_id):
    secrets_client.rotate_secret(
        SecretId=secret_id,
        # Placeholder; a real function ARN includes region and account ID
        RotationLambdaARN='arn:aws:lambda:rotate-function',
        RotationRules={
            'AutomaticallyAfterDays': 30
        }
    )
Lambda Secrets Rotation:
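import json

import boto3

secrets_client = boto3.client('secretsmanager')


def lambda_handler(event, context):
    secret = event['SecretId']
    token = event['ClientRequestToken']
    step = event['Step']
    if step == 'createSecret':
        # Generate new secret and stage it as AWSPENDING
        new_password = secrets_client.get_random_password(
            PasswordLength=32, ExcludePunctuation=True
        )['RandomPassword']
        secrets_client.put_secret_value(
            SecretId=secret,
            ClientRequestToken=token,
            SecretString=json.dumps({'password': new_password}),
            VersionStages=['AWSPENDING']
        )
    elif step == 'setSecret':
        # Each step is a separate invocation, so read the pending value back
        pending = json.loads(secrets_client.get_secret_value(
            SecretId=secret, VersionId=token, VersionStage='AWSPENDING'
        )['SecretString'])
        # Set secret in RDS (get_rds_connection is a helper not shown here)
        connection = get_rds_connection()
        connection.execute(
            "ALTER USER admin IDENTIFIED BY %s",
            (pending['password'],)
        )
    elif step == 'testSecret':
        # Test new secret; the helper should connect with the pending password
        connection = get_rds_connection()
        connection.ping()
    elif step == 'finishSecret':
        # Finalize rotation: find the version currently labeled AWSCURRENT
        versions = secrets_client.describe_secret(
            SecretId=secret
        )['VersionIdsToStages']
        previous_version = next(
            (v for v, stages in versions.items() if 'AWSCURRENT' in stages),
            None
        )
        if previous_version == token:
            return  # Already promoted on an earlier attempt
        secrets_client.update_secret_version_stage(
            SecretId=secret,
            VersionStage='AWSCURRENT',
            MoveToVersionId=token,
            RemoveFromVersionId=previous_version
        )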
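The four rotation steps work by moving staging labels between secret versions. The in-memory model below is a simplified sketch of that idea, not the Secrets Manager API: `FakeSecret` is a hypothetical class, and the real service's handling of the `AWSPENDING` label differs in detail.

```python
class FakeSecret:
    """In-memory stand-in for a secret with staged versions (not an AWS API)."""

    def __init__(self, version_id, value):
        self.values = {version_id: value}
        self.stages = {"AWSCURRENT": version_id}

    def create_pending(self, version_id, value):
        # createSecret: stage the new value without touching AWSCURRENT
        self.values[version_id] = value
        self.stages["AWSPENDING"] = version_id

    def finish(self, version_id):
        # finishSecret: promote pending to current, keep the old one as previous
        if self.stages.get("AWSPENDING") != version_id:
            raise ValueError("version is not staged as AWSPENDING")
        self.stages["AWSPREVIOUS"] = self.stages["AWSCURRENT"]
        self.stages["AWSCURRENT"] = version_id
        del self.stages["AWSPENDING"]

    def current(self):
        # Readers always resolve the value through the AWSCURRENT label
        return self.values[self.stages["AWSCURRENT"]]


secret = FakeSecret("v1", "old-password")
secret.create_pending("v2", "new-password")
value_during_rotation = secret.current()  # readers still see the old value
secret.finish("v2")
print(value_during_rotation, secret.current(), secret.stages)
```

Because readers only follow `AWSCURRENT`, applications keep working with the old credential until the new one has been set and tested.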
Q8: How do you implement data classification?
Answer:
Data Classification Framework:

  Classification Levels
    Public       - No restrictions
    Internal     - Employees only
    Confidential - Limited access
    Restricted   - Highly sensitive, strict controls

  Data Types
    PII - Personally Identifiable Information
    PHI - Protected Health Information
    PCI - Payment Card Industry
    IP  - Intellectual Property

  AWS Services
    Macie          - S3 sensitive data discovery
    Glue           - Data Catalog classifications
    Lake Formation - Fine-grained permissions
Amazon Macie Implementation:
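import boto3

# Macie must already be enabled for the account (macie.enable_macie())
macie = boto3.client('macie2')

# Custom data identifier (created first so the job can reference its ID)
identifier = macie.create_custom_data_identifier(
    name='email-detector',
    regex='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}',
    description='Detects email addresses',
    # Severity is assigned by number of occurrences found in an object
    severityLevels=[
        {'occurrencesThreshold': 5, 'severity': 'LOW'},
        {'occurrencesThreshold': 10, 'severity': 'MEDIUM'}
    ]
)

# One-time classification job scoped to keys under sensitive/
macie.create_classification_job(
    jobType='ONE_TIME',
    name='sensitive-data-scan',
    s3JobDefinition={
        'bucketDefinitions': [
            {
                'accountId': '123456789012',  # Placeholder 12-digit account ID
                'buckets': ['data-lake-bucket']
            }
        ],
        'scoping': {
            'includes': {
                'and': [
                    {
                        'simpleScopeTerm': {
                            'comparator': 'STARTS_WITH',
                            'key': 'OBJECT_KEY',
                            'values': ['sensitive/']
                        }
                    }
                ]
            }
        }
    },
    # IDs returned by create_custom_data_identifier, not identifier names
    customDataIdentifierIds=[identifier['customDataIdentifierId']]
)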
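The link between detected data types and classification levels can be sketched locally with the same kind of regular expression the custom identifier uses. This is an illustrative sketch, not how Macie works internally: the patterns, the `LEVEL_FOR_TYPE` mapping, and the `Internal` default are assumptions chosen for the example.

```python
import re

# Illustrative patterns only; real detectors need validation and context checks.
PATTERNS = {
    "email": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical mapping from detected type to classification level.
LEVEL_FOR_TYPE = {"email": "Confidential", "ssn": "Restricted"}
LEVEL_ORDER = ["Public", "Internal", "Confidential", "Restricted"]


def classify(text, default="Internal"):
    """Return (level, detected_types) using the strictest level that matches."""
    detected = sorted(
        name for name, pattern in PATTERNS.items() if pattern.search(text)
    )
    level = default
    for name in detected:
        candidate = LEVEL_FOR_TYPE[name]
        if LEVEL_ORDER.index(candidate) > LEVEL_ORDER.index(level):
            level = candidate
    return level, detected


print(classify("contact: jane@example.com"))
print(classify("SSN 123-45-6789, jane@example.com"))
print(classify("quarterly revenue summary"))
```

Taking the strictest matching level means one sensitive field is enough to raise the classification of the whole record.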
Q9: How do you implement incident response for data breaches?
Answer:
Incident Response Architecture:

  Detection (GuardDuty, Macie, CloudWatch)
    --> Analysis (Security Hub)
    --> Containment (Isolate Resources)

  Eradication (Remove Threat)
    --> Recovery (Restore Services)
    --> Post-Incident (Lessons Learned)
GuardDuty Integration:
import json

import boto3

# Enable GuardDuty
guardduty = boto3.client('guardduty')
# Create detector
detector = guardduty.create_detector(
    Enable=True,
    FindingPublishingFrequency='FIFTEEN_MINUTES',
    DataSources={
        'S3Logs': {'Enable': True},
        'Kubernetes': {'AuditLogs': {'Enable': True}},
        'MalwareProtection': {
            'ScanEc2InstanceWithFindings': {'EbsVolumes': {'Enable': True}}
        }
    }
)


# Automated response
def lambda_handler(event, context):
    # Parse GuardDuty finding
    finding = json.loads(event['Records'][0]['Sns']['Message'])
    severity = finding['severity']
    resource = finding['resource']
    # High severity - immediate response
    if severity >= 8:
        # Isolate EC2 instance
        ec2 = boto3.client('ec2')
        instance_id = resource['instanceDetails']['instanceId']
        # Apply quarantine security group.
        # Groups takes security group IDs, not names; this ID is a placeholder.
        ec2.modify_instance_attribute(
            InstanceId=instance_id,
            Groups=['sg-0123456789abcdef0']
        )
        # Snapshot each attached EBS volume for forensics
        volumes = ec2.describe_volumes(
            Filters=[
                {'Name': 'attachment.instance-id', 'Values': [instance_id]}
            ]
        )['Volumes']
        for volume in volumes:
            ec2.create_snapshot(
                VolumeId=volume['VolumeId'],
                Description=f'Forensics snapshot for finding {finding["id"]}'
            )
        # Notify security team (helper not shown here)
        notify_security_team(finding)
Answer:
DLP Architecture:
Data Loss Prevention Architecture

  Data Discovery (Macie, Inspector)
    --> Classification (Glue Data Catalog)
    --> Protection (Lake Formation, S3 Policies)

  Monitoring (CloudWatch Events)
    --> Response (Lambda Functions)
    --> Prevention (IAM, SCPs)
S3 DLP Policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::sensitive-data",
        "arn:aws:s3:::sensitive-data/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    },
    {
      "Sid": "RequireEncryption",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::sensitive-data/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}
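The effect of the two Deny statements above on a single request can be traced with a small function. This is a hand-written sketch of the logic, not the IAM policy engine; `is_denied` and its parameters are hypothetical, and a result of `False` only means these statements do not deny the request, not that it is allowed.

```python
def is_denied(action, secure_transport, sse_header=None):
    """Mirror the two Deny statements for one request to the bucket."""
    # Statement 1: any S3 action sent without TLS is denied
    if not secure_transport:
        return True
    # Statement 2: PutObject is denied unless the SSE header equals aws:kms.
    # StringNotEquals also matches when the header is absent.
    if action == "s3:PutObject" and sse_header != "aws:kms":
        return True
    return False


print(is_denied("s3:GetObject", secure_transport=False))
print(is_denied("s3:PutObject", secure_transport=True, sse_header="AES256"))
print(is_denied("s3:PutObject", secure_transport=True, sse_header="aws:kms"))
```

The absent-header case is the one that usually surprises people: an upload that relies on bucket default encryption and omits the header is still denied.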
Q11: How do you implement zero trust architecture?
Answer:
Zero Trust Components:
Zero Trust Architecture

  Identity Verification
    • Multi-factor authentication
    • Certificate-based authentication
    • Continuous authentication

  Device Verification
    • Device posture checks
    • Patch compliance
    • Endpoint detection

  Network Micro-Segmentation
    • VPC per workload
    • Security groups as firewalls
    • Network policies

  Continuous Monitoring
    • Behavior analytics
    • Anomaly detection
    • Real-time threat intelligence
IAM Identity Center Implementation:
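import boto3

# Configure IAM Identity Center (formerly AWS SSO)
sso_admin = boto3.client('sso-admin')
instance_arn = 'arn:aws:sso:::instance/ssoins-12345678'

# Create permission set (InstanceArn is required)
permission_set = sso_admin.create_permission_set(
    InstanceArn=instance_arn,
    Name='DataEngineerAccess',
    Description='Access for data engineers',
    SessionDuration='PT8H'
)
permission_set_arn = permission_set['PermissionSet']['PermissionSetArn']

# Managed policies are attached in separate calls, one per policy
for policy_arn in [
    'arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess',
    'arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess'
]:
    sso_admin.attach_managed_policy_to_permission_set(
        InstanceArn=instance_arn,
        PermissionSetArn=permission_set_arn,
        ManagedPolicyArn=policy_arn
    )

# Assign users
sso_admin.create_account_assignment(
    InstanceArn=instance_arn,
    TargetId='123456789012',
    TargetType='AWS_ACCOUNT',
    PrincipalType='USER',
    PrincipalId='user-id',
    PermissionSetArn=permission_set_arn
)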
Q12: How do you implement data residency requirements?
Answer:
Data Residency Architecture:
Data Residency Implementation

  Regional Requirements
    EU:     Data must stay in EU (GDPR)
    China:  Data must stay in China (China regulations)
    Russia: Data must stay in Russia (Data localization)

  AWS Implementation
    • Region-specific resources
    • SCPs to prevent region movement
    • Cross-region replication controls
Service Control Policy (SCP):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyNonEURegions",
      "Effect": "Deny",
      "NotAction": [
        "iam:*",
        "sts:*",
        "organizations:*",
        "account:*"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "eu-west-1",
            "eu-west-2",
            "eu-west-3",
            "eu-central-1"
          ]
        }
      }
    }
  ]
}
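The combination of `NotAction` and `StringNotEquals` in the SCP above is easy to misread, so here is a minimal sketch of how the statement applies to a request. It is not the AWS policy engine: `scp_denies` is a hypothetical helper, and wildcard matching is approximated with `fnmatch` on lowercased action names.

```python
from fnmatch import fnmatchcase

# Values copied from the SCP above
EXEMPT_ACTIONS = ["iam:*", "sts:*", "organizations:*", "account:*"]
ALLOWED_REGIONS = {"eu-west-1", "eu-west-2", "eu-west-3", "eu-central-1"}


def scp_denies(action, requested_region):
    """Return True when the DenyNonEURegions statement matches the request."""
    # NotAction: the statement covers every action except the exempt ones
    exempt = any(fnmatchcase(action.lower(), p) for p in EXEMPT_ACTIONS)
    if exempt:
        return False
    # StringNotEquals with several values matches only when the region
    # equals none of them
    return requested_region not in ALLOWED_REGIONS


print(scp_denies("s3:PutObject", "us-east-1"))
print(scp_denies("s3:PutObject", "eu-west-1"))
print(scp_denies("iam:CreateRole", "us-east-1"))
```

The exempt list exists because global services such as IAM resolve to a single region and would otherwise be blocked everywhere.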
Q13: How do you implement security monitoring and alerting?
Answer:
Security Monitoring Architecture:

  Data Sources (VPC Flow, CloudTrail, GuardDuty)
    --> Collection (CloudWatch Logs)
    --> Analysis (Security Hub)
    --> Detective
    --> Lambda Actions
    --> SNS Alerts
Security Hub Integration:
import json

import boto3

# Enable Security Hub
securityhub = boto3.client('securityhub')
securityhub.enable_security_hub(
    Tags={
        'Environment': 'production'
    }
)
# Custom insight: findings with normalized severity 70+, grouped by resource type
securityhub.create_insight(
    Name='High Severity Findings',
    Filters={
        'SeverityNormalized': [{'Gte': 70}]
    },
    GroupByAttribute='ResourceType'
)


# Automated response to findings.
# isolate_resource, notify_soc_team, create_incident_ticket,
# add_to_investigation_queue and notify_security_team are helpers not shown here.
def lambda_handler(event, context):
    finding = json.loads(event['Records'][0]['Sns']['Message'])
    severity = finding['Severity']['Normalized']
    product = finding['ProductArn']
    # Critical severity (90-100) - immediate action
    if severity >= 90:
        # Isolate resource
        isolate_resource(finding['Resources'][0]['Id'])
        # Notify SOC
        notify_soc_team(finding)
        # Create ticket
        create_incident_ticket(finding)
    # High severity (70-89) - investigate
    elif severity >= 70:
        # Add to investigation queue
        add_to_investigation_queue(finding)
        # Notify security team
        notify_security_team(finding)
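The thresholds of 90 and 70 in the handler above line up with the severity bands of the AWS Security Finding Format. The sketch below makes that mapping explicit and mirrors the routing so it can be tested without AWS; `severity_label` and `triage` are hypothetical helper names, and the route names are placeholders.

```python
def severity_label(normalized):
    """Map an ASFF Severity.Normalized score (0-100) to its label."""
    if not 0 <= normalized <= 100:
        raise ValueError("normalized severity must be between 0 and 100")
    if normalized == 0:
        return "INFORMATIONAL"
    if normalized <= 39:
        return "LOW"
    if normalized <= 69:
        return "MEDIUM"
    if normalized <= 89:
        return "HIGH"
    return "CRITICAL"


def triage(normalized):
    """Route a finding using the same thresholds as the handler above."""
    if normalized >= 90:
        return "isolate"
    if normalized >= 70:
        return "investigate"
    return "record"


for score in (0, 39, 40, 70, 90):
    print(score, severity_label(score), triage(score))
```

Keeping the routing in a pure function makes the thresholds easy to unit test before wiring them to real response actions.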
Q14: How do you implement secure data sharing?
Answer:
Secure Data Sharing Architecture:

  Producer Account                     Consumer Account
  ----------------                     ----------------
  S3 Bucket (Source Data)              Cross-Account Access Role

  S3 Bucket (Source Data)    --> KMS (Shared Key)
  Cross-Account Access Role  --> KMS (Shared Key)

  Lake Formation Sharing
    • Column-level permissions
    • Row-level filters
    • Time-based access
Cross-Account KMS Key Sharing:
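import json

import boto3

# Placeholder account IDs
PRODUCER_ACCOUNT_ID = '111111111111'
CONSUMER_ACCOUNT_ID = '222222222222'

# Create KMS key with cross-account access
kms = boto3.client('kms')
key = kms.create_key(
    Description='Cross-account data sharing key',
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                # Without this statement the owning account could not
                # manage the key, and KMS rejects such a policy by default
                "Sid": "EnableOwnerAccountAdministration",
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{PRODUCER_ACCOUNT_ID}:root"
                },
                "Action": "kms:*",
                "Resource": "*"
            },
            {
                "Sid": "AllowCrossAccountAccess",
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{CONSUMER_ACCOUNT_ID}:root"
                },
                "Action": [
                    "kms:Decrypt",
                    "kms:DescribeKey"
                ],
                "Resource": "*"
            }
        ]
    })
)
# Create alias
kms.create_alias(
    AliasName='alias/cross-account-key',
    TargetKeyId=key['KeyMetadata']['KeyId']
)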
Q15: How do you implement security testing for data pipelines?
Answer:
Security Testing Framework:

  Static Analysis      Dynamic Analysis     Pen Testing
  ---------------      ----------------     -----------
  Code Review          DAST                 Pen Tests
  SAST                 Fuzzing              Red Team
  Dependency           Runtime

  Compliance Scanning          Vulnerability Scanning
  -------------------          ----------------------
  Inspector                    GuardDuty
  Config Rules                 Macie
  Security Hub                 ECR Scanning
Automated Security Scanning:
import boto3

# Amazon Inspector (v2) for vulnerability scanning.
# Inspector v2 has no assessment targets; scanning is enabled per resource
# type and then runs continuously.
inspector = boto3.client('inspector2')
inspector.enable(
    accountIds=['123456789012'],  # Placeholder 12-digit account ID
    resourceTypes=['EC2', 'ECR', 'LAMBDA']
)
# Schedule weekly scans: 02:00 UTC every Sunday.
# EventBridge cron expressions need six fields, including the year.
events = boto3.client('events')
events.put_rule(
    Name='WeeklySecurityScan',
    ScheduleExpression='cron(0 2 ? * SUN *)',
    State='ENABLED'
)
# The target function also needs a resource-based permission that lets
# events.amazonaws.com invoke it
events.put_targets(
    Rule='WeeklySecurityScan',
    Targets=[
        {
            'Id': 'SecurityScanLambda',
            'Arn': 'arn:aws:lambda:us-east-1:123456789:function:security-scan'
        }
    ]
)
Q16: How do you implement API security for data services?
Answer:
API Security Architecture:

  Client (App)
    --> API Gateway (WAF, Throttling, Auth)
    --> Backend (Lambda (Auth))

  Security Layers
    1. WAF - Rate limiting, IP blocking, SQL injection
    2. API Keys - Usage tracking, quotas
    3. IAM Auth - AWS signature verification
    4. Cognito - User pools, OAuth 2.0
    5. Lambda Authorizers - Custom auth logic
WAF Configuration:
import boto3

# Create WAF Web ACL
waf = boto3.client('wafv2')
waf.create_web_acl(
    Name='data-api-protection',
    Scope='REGIONAL',
    DefaultAction={'Allow': {}},
    Rules=[
        {
            # Block an IP once it exceeds the request limit
            'Name': 'RateLimit',
            'Priority': 1,
            'Action': {'Block': {}},
            'Statement': {
                'RateBasedStatement': {
                    'Limit': 2000,
                    'AggregateKeyType': 'IP'
                }
            },
            'VisibilityConfig': {
                'SampledRequestsEnabled': True,
                'CloudWatchMetricsEnabled': True,
                'MetricName': 'RateLimit'
            }
        },
        {
            # Block request bodies that look like SQL injection
            'Name': 'SQLInjection',
            'Priority': 2,
            'Action': {'Block': {}},
            'Statement': {
                'SqliMatchStatement': {
                    'FieldToMatch': {'Body': {}},
                    'TextTransformations': [
                        {'Priority': 0, 'Type': 'URL_DECODE'}
                    ]
                }
            },
            'VisibilityConfig': {
                'SampledRequestsEnabled': True,
                'CloudWatchMetricsEnabled': True,
                'MetricName': 'SQLInjection'
            }
        }
    ],
    VisibilityConfig={
        'SampledRequestsEnabled': True,
        'CloudWatchMetricsEnabled': True,
        'MetricName': 'DataAPIProtection'
    }
)
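The idea behind the `RateLimit` rule, counting requests per IP over a recent window and blocking above a limit, can be sketched in a few lines. This is a simplified model and not how AWS WAF is implemented: `SlidingWindowLimiter` is a hypothetical class, the window length is passed in explicitly, and rejected requests are not counted here.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP within the last `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(limit=3, window_seconds=300)
burst = [limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3)]
later = limiter.allow("203.0.113.7", 301)
other_ip = limiter.allow("198.51.100.9", 3)
print(burst, later, other_ip)
```

Passing `now` in as an argument, rather than reading the clock inside `allow`, keeps the limiter deterministic under test.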
Q17: How do you implement compliance auditing?
Answer:
Compliance Auditing Architecture:

  Config Rules (Compliance Checks)
    --> Audit Manager (Evidence Collection)
    --> Artifact (Compliance Reports)

  Config Rules   --> Remediation Lambda
  Audit Manager  --> Dashboard (QuickSight)
  Artifact       --> External Auditors
AWS Config Rules:
import boto3

# Create Config rule for encryption (AWS managed rule)
config = boto3.client('config')
config.put_config_rule(
    ConfigRule={
        'ConfigRuleName': 's3-bucket-server-side-encryption-enabled',
        'Source': {
            'Owner': 'AWS',
            'SourceIdentifier': 'S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED'
        },
        'Scope': {
            'ComplianceResourceTypes': ['AWS::S3::Bucket']
        }
    }
)
# Create custom Config rule backed by a Lambda function
config.put_config_rule(
    ConfigRule={
        'ConfigRuleName': 'data-lake-encryption-check',
        'Source': {
            'Owner': 'CUSTOM_LAMBDA',
            'SourceIdentifier': 'arn:aws:lambda:us-east-1:123456789:function:check-encryption',
            'SourceDetails': [
                {
                    # Invoke the function whenever a resource's configuration changes
                    'EventSource': 'aws.config',
                    'MessageType': 'ConfigurationItemChangeNotification'
                }
            ]
        }
    }
)
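The custom rule above delegates its decision to a Lambda function. The core of such a function is a pure check over one configuration item, sketched here under stated assumptions: `evaluate_bucket` is a hypothetical name, and the field names inside `supplementaryConfiguration` are assumed for illustration and should be verified against real configuration items. The three return values are the compliance types AWS Config accepts.

```python
def evaluate_bucket(configuration_item):
    """Return a Config-style compliance result for one S3 bucket item."""
    if configuration_item.get("resourceType") != "AWS::S3::Bucket":
        return "NOT_APPLICABLE"
    # Assumed layout of the bucket's encryption settings (verify before use)
    supplementary = configuration_item.get("supplementaryConfiguration", {})
    sse = supplementary.get("ServerSideEncryptionConfiguration") or {}
    algorithms = [
        rule.get("applyServerSideEncryptionByDefault", {}).get("sseAlgorithm")
        for rule in sse.get("rules", [])
    ]
    return "COMPLIANT" if "aws:kms" in algorithms else "NON_COMPLIANT"


encrypted_bucket = {
    "resourceType": "AWS::S3::Bucket",
    "supplementaryConfiguration": {
        "ServerSideEncryptionConfiguration": {
            "rules": [
                {"applyServerSideEncryptionByDefault": {"sseAlgorithm": "aws:kms"}}
            ]
        }
    },
}
plain_bucket = {"resourceType": "AWS::S3::Bucket", "supplementaryConfiguration": {}}
instance = {"resourceType": "AWS::EC2::Instance"}

print(evaluate_bucket(encrypted_bucket))
print(evaluate_bucket(plain_bucket))
print(evaluate_bucket(instance))
```

In a deployed rule the handler would pass this result back to AWS Config; that reporting step is omitted here.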
Q18: How do you implement disaster recovery for security infrastructure?
Answer:
Security DR Architecture:
Security Infrastructure DR

  Primary Region               DR Region
  --------------               ---------
  IAM (Users/Roles)            IAM (Replicated)
  KMS Keys (Primary)           KMS Keys (Replicated)
  Security Hub (Findings)      Security Hub (Replicated)
IAM DR Strategy:
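import json

import boto3


# Export IAM configuration. IAM is a global service, so this is a
# configuration backup (ideally restorable into a separate account)
# rather than cross-region replication. list_policies returns metadata
# only; fetch each document with get_policy_version for a full backup.
def export_iam_config():
    iam = boto3.client('iam')

    def list_all(operation, key, **kwargs):
        # list_* calls are paginated; walk every page
        items = []
        for page in iam.get_paginator(operation).paginate(**kwargs):
            items.extend(page[key])
        return items

    users = list_all('list_users', 'Users')
    roles = list_all('list_roles', 'Roles')
    # Scope='Local' returns only customer managed policies
    policies = list_all('list_policies', 'Policies', Scope='Local')

    # Store in S3 for DR; default=str serializes datetime fields
    s3 = boto3.client('s3')
    s3.put_object(
        Bucket='dr-config-bucket',
        Key='iam/config.json',
        Body=json.dumps({
            'users': users,
            'roles': roles,
            'policies': policies
        }, default=str)
    )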
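An export is only useful if it can be turned back into API calls. As a rough sketch of the restore side, the helpers below map exported role records to keyword arguments for iam.create_role and skip roles that already exist. The function names and the shape of the exported file (a top-level 'roles' list, as written by the export above) are assumptions; service-linked roles are skipped because they are created through a separate API.

```python
import json


def role_restore_kwargs(exported_role):
    """Map one exported role record to keyword arguments for iam.create_role."""
    trust_policy = exported_role["AssumeRolePolicyDocument"]
    if not isinstance(trust_policy, str):
        # create_role expects the trust policy as a JSON string
        trust_policy = json.dumps(trust_policy)
    kwargs = {
        "RoleName": exported_role["RoleName"],
        "Path": exported_role.get("Path", "/"),
        "AssumeRolePolicyDocument": trust_policy,
    }
    if exported_role.get("Description"):
        kwargs["Description"] = exported_role["Description"]
    if exported_role.get("MaxSessionDuration"):
        kwargs["MaxSessionDuration"] = exported_role["MaxSessionDuration"]
    return kwargs


def plan_role_restore(exported_config, existing_role_names):
    """Return create_role arguments for exported roles that are missing."""
    plan = []
    for role in exported_config["roles"]:
        if role["RoleName"] in existing_role_names:
            continue
        # Service-linked roles cannot be recreated with create_role
        if role.get("Path", "/").startswith("/aws-service-role/"):
            continue
        plan.append(role_restore_kwargs(role))
    return plan
```

Each entry in the returned plan could then be passed as iam.create_role(**entry); attached and inline policies would need a similar pass.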
Q19: How do you implement real-time threat detection?
Answer:
Real-Time Threat Detection Architecture:
┌───────────────────────────────────────────────────────────────────┐
│                    Real-Time Threat Detection                     │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Data Sources          Analysis              Response             │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ VPC Flow        │──▶│ GuardDuty       │──▶│ Lambda          │  │
│  │ CloudTrail      │   │ Machine         │   │ Actions         │  │
│  │ DNS Logs        │   │ Learning        │   │                 │  │
│  └─────────────────┘   └────────┬────────┘   └────────┬────────┘  │
│                                 │                     │           │
│                                 ▼                     ▼           │
│                        ┌─────────────────┐   ┌─────────────────┐  │
│                        │ Detective       │   │ SNS             │  │
│                        │ Investigation   │   │ Alerts          │  │
│                        └─────────────────┘   └─────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
GuardDuty Finding Response:
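import json

import boto3

# Security group with no inbound or outbound rules (placeholder ID)
QUARANTINE_SG_ID = 'sg-0123456789abcdef0'


def lambda_handler(event, context):
    # Parse GuardDuty finding (assumes the SNS message body is the finding JSON)
    finding = json.loads(event['Records'][0]['Sns']['Message'])
    finding_type = finding['type']
    severity = finding['severity']

    # Categorize and respond; handle_cryptomining, handle_malware and
    # notify_security_team are assumed to be defined elsewhere
    if 'UnauthorizedAccess' in finding_type:
        # Potential intrusion
        response = handle_intrusion(finding)
    elif 'CryptoCurrency' in finding_type:
        # Cryptomining
        response = handle_cryptomining(finding)
    elif 'Trojan' in finding_type:
        # Malware
        response = handle_malware(finding)
    else:
        # Unrecognized finding type: report it without acting
        response = {'action': 'none', 'findingType': finding_type, 'severity': severity}
    return response


def handle_intrusion(finding):
    instance_id = finding['resource']['instanceDetails']['instanceId']

    # Isolate instance; Groups takes security group IDs, not names
    ec2 = boto3.client('ec2')
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        Groups=[QUARANTINE_SG_ID]
    )

    # Snapshot for forensics
    volumes = ec2.describe_volumes(
        Filters=[{'Name': 'attachment.instance-id', 'Values': [instance_id]}]
    )['Volumes']
    snapshot_ids = []
    for volume in volumes:
        snapshot = ec2.create_snapshot(
            VolumeId=volume['VolumeId'],
            Description=f'Forensics: {finding["id"]}'
        )
        snapshot_ids.append(snapshot['SnapshotId'])

    # Notify security team
    notify_security_team(finding, 'intrusion')
    return {'action': 'isolated', 'instanceId': instance_id, 'snapshots': snapshot_ids}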
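The handler above branches only on the finding type. Two details often matter in practice: findings routed through EventBridge arrive wrapped in an envelope with the finding under "detail", and automatic isolation is usually reserved for higher-severity findings. The sketch below illustrates both. The function names and the 7.0 cut-off are assumptions chosen for illustration, not values taken from the text.

```python
import json


def extract_finding(sns_message):
    """Return the GuardDuty finding from an SNS message body.

    Accepts either the bare finding JSON or an EventBridge event that
    carries the finding under the "detail" key.
    """
    payload = json.loads(sns_message)
    if payload.get("source") == "aws.guardduty" and "detail" in payload:
        return payload["detail"]
    return payload


def choose_action(finding, isolate_threshold=7.0):
    """Pick 'isolate' for severe findings on an EC2 instance, else 'notify'.

    isolate_threshold is a hypothetical policy setting; tune it to your
    own tolerance for automated remediation.
    """
    has_instance = "instanceDetails" in finding.get("resource", {})
    if finding["severity"] >= isolate_threshold and has_instance:
        return "isolate"
    return "notify"
```

Keeping the decision in a pure function like choose_action makes the remediation policy easy to unit test without touching EC2.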
Q20: How do you implement secure DevOps for data pipelines?
Answer:
Secure DevOps Architecture:
┌───────────────────────────────────────────────────────────────────┐
│                 Secure DevOps for Data Pipelines                  │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Code                  Build                 Deploy               │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ Git             │──▶│ CodeBuild       │──▶│ CloudForm.      │  │
│  │ (Encrypted)     │   │ (SAST)          │   │ (Review)        │  │
│  │ Secrets         │   │ Dep Scan        │   │                 │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│           │                     │                     │           │
│           ▼                     ▼                     ▼           │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │                       Security Gates                        │  │
│  │ • SAST/DAST scanning                                        │  │
│  │ • Dependency vulnerability check                            │  │
│  │ • Infrastructure as Code scanning                           │  │
│  │ • Secret detection                                          │  │
│  └─────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
CodePipeline Security:
import boto3

# Create secure pipeline. The account ID, role names, and template file
# name are placeholders.
codepipeline = boto3.client('codepipeline')
pipeline = codepipeline.create_pipeline(
    pipeline={
        'name': 'secure-data-pipeline',
        'roleArn': 'arn:aws:iam::123456789012:role/PipelineRole',
        'artifactStore': {
            'type': 'S3',
            'location': 'pipeline-artifacts-bucket'
        },
        'stages': [
            {
                'name': 'Source',
                'actions': [
                    {
                        'name': 'Source',
                        'actionTypeId': {
                            'category': 'Source',
                            'owner': 'AWS',
                            'provider': 'CodeCommit',
                            'version': '1'
                        },
                        'configuration': {
                            'RepositoryName': 'data-pipeline',
                            'BranchName': 'main'
                        },
                        # Later stages consume the source as a named artifact
                        'outputArtifacts': [{'name': 'SourceOutput'}]
                    }
                ]
            },
            {
                'name': 'Build',
                'actions': [
                    {
                        'name': 'SecurityScan',
                        'actionTypeId': {
                            'category': 'Build',
                            'owner': 'AWS',
                            'provider': 'CodeBuild',
                            'version': '1'
                        },
                        'configuration': {
                            'ProjectName': 'security-scan'
                        },
                        'inputArtifacts': [{'name': 'SourceOutput'}]
                    }
                ]
            },
            {
                'name': 'Deploy',
                'actions': [
                    {
                        'name': 'Deploy',
                        'actionTypeId': {
                            'category': 'Deploy',
                            'owner': 'AWS',
                            'provider': 'CloudFormation',
                            'version': '1'
                        },
                        # CHANGE_SET_REPLACE only creates the change set for
                        # review; a CHANGE_SET_EXECUTE action applies it
                        'configuration': {
                            'ActionMode': 'CHANGE_SET_REPLACE',
                            'StackName': 'data-pipeline-stack',
                            'ChangeSetName': 'pipeline-changeset',
                            'TemplatePath': 'SourceOutput::template.yaml',
                            'RoleArn': 'arn:aws:iam::123456789012:role/CloudFormationDeployRole'
                        },
                        'inputArtifacts': [{'name': 'SourceOutput'}]
                    }
                ]
            }
        ]
    }
)
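The "Secret detection" gate in the diagram can be illustrated with a small scanner of the kind the security-scan build project might run. This is a minimal sketch, not a replacement for a dedicated scanning tool: the pattern set is deliberately tiny, and the names SECRET_PATTERNS and scan_text are hypothetical.

```python
import re

# Deliberately small, illustrative pattern set
SECRET_PATTERNS = {
    # AWS access key IDs: AKIA (long-term) or ASIA (temporary) plus 16 characters
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # PEM-encoded private key headers
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) pairs for every suspected secret."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

A build step could call scan_text on each changed file and exit with a non-zero status when the result is non-empty, which fails the pipeline stage.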
Q21: How do you implement data sovereignty?
Answer:
Data Sovereignty Architecture:
┌───────────────────────────────────────────────────────────────────┐
│                  Data Sovereignty Implementation                  │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Requirements          Implementation        Controls             │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ EU Data         │   │ Region-specific │   │ SCPs            │  │
│  │ China Data      │   │ Resources       │   │ Config Rules    │  │
│  │ Russia Data     │   │ VPC Isolation   │   │ Monitoring      │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│                                                                   │
│  Cross-Border Restrictions                                        │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │ • Block cross-region data movement                          │  │
│  │ • Encrypt data with region-specific keys                    │  │
│  │ • Audit all data access patterns                            │  │
│  └─────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
Region-Specific Implementation:
import boto3


# Deploy infrastructure per region
class DataSovereigntyManager:
    def __init__(self):
        # China regions belong to the separate aws-cn partition and
        # require their own account and credentials
        self.regions = {
            'eu': ['eu-west-1', 'eu-west-2', 'eu-central-1'],
            'china': ['cn-north-1', 'cn-northwest-1'],
            'us': ['us-east-1', 'us-west-2']
        }

    def deploy_region_specific(self, region_group):
        regions = self.regions[region_group]
        for region in regions:
            # Create region-specific resources (create_kms_key, create_vpc
            # and create_redshift are not shown here)
            self.create_s3_bucket(region)
            self.create_kms_key(region)
            self.create_vpc(region)
            self.create_redshift(region)

    def create_s3_bucket(self, region):
        s3 = boto3.client('s3', region_name=region)
        bucket = f'data-sovereign-{region}'
        if region == 'us-east-1':
            # us-east-1 rejects an explicit LocationConstraint
            s3.create_bucket(Bucket=bucket)
        else:
            s3.create_bucket(
                Bucket=bucket,
                CreateBucketConfiguration={
                    'LocationConstraint': region
                }
            )
        # Apply encryption (uses the AWS managed S3 key unless a
        # KMSMasterKeyID is supplied)
        s3.put_bucket_encryption(
            Bucket=bucket,
            ServerSideEncryptionConfiguration={
                'Rules': [
                    {
                        'ApplyServerSideEncryptionByDefault': {
                            'SSEAlgorithm': 'aws:kms'
                        }
                    }
                ]
            }
        )
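The diagram lists SCPs as the control that blocks activity outside approved regions. A minimal sketch of such a policy document follows, built on the aws:RequestedRegion global condition key. The helper name build_region_deny_scp and the default list of exempted global services are assumptions; the right exemption list depends on which global services your accounts use.

```python
import json


def build_region_deny_scp(
    allowed_regions,
    exempt_services=("iam", "organizations", "sts", "support"),
):
    """Build an SCP document that denies requests outside allowed_regions.

    exempt_services is a hypothetical default: global services must be
    exempted because their requests are not tied to a workload region.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "NotAction": sorted(f"{service}:*" for service in exempt_services),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy for the EU region group used in the class above
    print(json.dumps(build_region_deny_scp(["eu-west-1", "eu-west-2", "eu-central-1"]), indent=2))
```

The resulting JSON could be registered through AWS Organizations as a policy of type SERVICE_CONTROL_POLICY and attached to the organizational unit that holds the EU accounts.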
Q22: How do you implement secure backup and recovery?
Answer:
Secure Backup Architecture:
┌───────────────────────────────────────────────────────────────────┐
│                    Secure Backup Architecture                     │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Backup Sources        Backup Storage        Recovery             │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ RDS             │──▶│ S3              │──▶│ Restore         │  │
│  │ Redshift        │   │ (Encrypted)     │   │ (Cross-Account) │  │
│  │ EBS             │   │ (Versioned)     │   │                 │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│                                                                   │
│  Security Controls                                                │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │ • Encryption at rest (KMS)                                  │  │
│  │ • Access controls (IAM)                                     │  │
│  │ • Immutability (Object Lock)                                │  │
│  │ • Cross-region replication                                  │  │
│  └─────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
Immutable Backup with S3 Object Lock:
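import boto3

# Create bucket with Object Lock (outside us-east-1, also pass a
# CreateBucketConfiguration with the LocationConstraint)
s3 = boto3.client('s3')
s3.create_bucket(
    Bucket='immutable-backups',
    ObjectLockEnabledForBucket=True
)
# Object Lock requires versioning. S3 turns it on automatically for
# buckets created with ObjectLockEnabledForBucket, so this call is a safeguard.
s3.put_bucket_versioning(
    Bucket='immutable-backups',
    VersioningConfiguration={'Status': 'Enabled'}
)
# Set default retention. In COMPLIANCE mode no user, including the root
# user, can shorten or remove the retention period.
s3.put_object_lock_configuration(
    Bucket='immutable-backups',
    ObjectLockConfiguration={
        'ObjectLockEnabled': 'Enabled',
        'Rule': {
            'DefaultRetention': {
                'Mode': 'COMPLIANCE',
                'Years': 7
            }
        }
    }
)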
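The default retention above applies to every new object version. Individual backups can also carry their own retention by passing ObjectLockMode and ObjectLockRetainUntilDate to put_object. The helper below is a sketch of how those arguments might be assembled; the function name object_lock_put_kwargs and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def object_lock_put_kwargs(bucket, key, body, retention_days,
                           mode="COMPLIANCE", now=None):
    """Build keyword arguments for s3.put_object with per-object retention."""
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError(f"unsupported Object Lock mode: {mode}")
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    # Allow the caller to inject the current time, which keeps tests deterministic
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": mode,
        "ObjectLockRetainUntilDate": now + timedelta(days=retention_days),
    }
```

Usage would look like s3.put_object(**object_lock_put_kwargs('immutable-backups', 'db/backup.sql.gz', data, 30)), assuming the bucket was created with Object Lock enabled.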
Q23: How do you implement security for streaming data?
Answer:
Streaming Security Architecture:
┌───────────────────────────────────────────────────────────────────┐
│                  Streaming Security Architecture                  │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Producers             Stream Processing     Consumers            │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ Producers       │──▶│ Kinesis         │──▶│ Consumers       │  │
│  │ (Encrypted)     │   │ (Encrypted)     │   │ (Encrypted)     │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│                                                                   │
│  Security Controls                                                │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │ • KMS encryption for data at rest                           │  │
│  │ • TLS for data in transit                                   │  │
│  │ • IAM policies for access control                           │  │
│  │ • VPC endpoints for private connectivity                    │  │
│  └─────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
Kinesis Encryption:
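import json

import boto3

# Create the Kinesis stream, then enable KMS encryption on it
kinesis = boto3.client('kinesis')
kinesis.create_stream(
    StreamName='secure-stream',
    ShardCount=4
)
# create_stream is asynchronous; wait until the stream is active
kinesis.get_waiter('stream_exists').wait(StreamName='secure-stream')

# Enable encryption (account ID and key ID are placeholders)
kinesis.start_stream_encryption(
    StreamName='secure-stream',
    EncryptionType='KMS',
    KeyId='arn:aws:kms:us-east-1:123456789012:key/key-id'
)

# IAM policy for secure access
iam = boto3.client('iam')
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:GetRecords",
                "kinesis:GetShardIterator",
                "kinesis:DescribeStream"
            ],
            "Resource": "arn:aws:kinesis:us-east-1:123456789012:stream/secure-stream",
            # Require TLS for every request; encryption at rest is
            # configured on the stream itself
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "true"
                }
            }
        },
        {
            # ListStreams does not support resource-level permissions
            "Effect": "Allow",
            "Action": "kinesis:ListStreams",
            "Resource": "*"
        },
        {
            # Consumers need decrypt access to read KMS-encrypted records
            "Effect": "Allow",
            "Action": "kms:Decrypt",
            "Resource": "arn:aws:kms:us-east-1:123456789012:key/key-id"
        }
    ]
}
# Register the policy (the policy name is a placeholder)
iam.create_policy(
    PolicyName='secure-stream-consumer',
    PolicyDocument=json.dumps(policy)
)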
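Because start_stream_encryption is a separate call from create_stream, it is worth verifying afterwards that encryption actually took effect. The sketch below checks the response of kinesis.describe_stream_summary, which reports EncryptionType and KeyId for the stream. The function name stream_encryption_issues is hypothetical, and the key comparison assumes the key is reported in the same form (ARN, alias, or ID) that was supplied when encryption was enabled.

```python
def stream_encryption_issues(summary_response, expected_key_id=None):
    """Return a list of problems found in a describe_stream_summary response."""
    summary = summary_response["StreamDescriptionSummary"]
    issues = []
    if summary.get("EncryptionType") != "KMS":
        issues.append("stream is not KMS-encrypted")
    elif expected_key_id and summary.get("KeyId") != expected_key_id:
        # Assumes KeyId is reported in the same form it was configured with
        issues.append("stream uses an unexpected KMS key")
    return issues
```

A scheduled check could call stream_encryption_issues(kinesis.describe_stream_summary(StreamName='secure-stream')) and alert when the returned list is non-empty.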
Q24: How do you implement security for data lakes?
Answer:
Data Lake Security Architecture:
┌───────────────────────────────────────────────────────────────────┐
│                  Data Lake Security Architecture                  │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Perimeter Security    Data Security         Access Control       │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ WAF             │   │ Encryption      │   │ Lake Form.      │  │
│  │ Shield          │   │ Classification  │   │ IAM             │  │
│  │ VPC             │   │ Masking         │   │ S3 Policies     │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│                                                                   │
│  Monitoring            Governance            Compliance           │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ CloudTrail      │   │ Data Catalog    │   │ Macie           │  │
│  │ GuardDuty       │   │ Lineage         │   │ Config          │  │
│  │ Security Hub    │   │ Quality         │   │ Audit Mgr       │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
Lake Formation Security:
import boto3

# Register and secure data lake (the account ID in the ARNs is a placeholder)
lakeformation = boto3.client('lakeformation')
# Register location
lakeformation.register_resource(
    ResourceArn='arn:aws:s3:::data-lake-bucket',
    RoleArn='arn:aws:iam::123456789012:role/LakeFormationRole'
)
# Grant database permissions
lakeformation.grant_permissions(
    Principal={'DataLakePrincipalIdentifier': 'arn:aws:iam::123456789012:role/data-engineer'},
    Resource={
        'Database': {
            'Name': 'analytics_db'
        }
    },
    Permissions=['CREATE_TABLE', 'ALTER', 'DROP']
)
# Grant table permissions. grant_permissions has no ConditionExpression
# parameter; row- and column-level restrictions are typically defined as
# data cells filters and granted on the filter resource.
lakeformation.grant_permissions(
    Principal={'DataLakePrincipalIdentifier': 'arn:aws:iam::123456789012:role/analyst'},
    Resource={
        'Table': {
            'DatabaseName': 'analytics_db',
            'Name': 'sales_data'
        }
    },
    Permissions=['SELECT']
)
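Row-level restrictions in Lake Formation are usually expressed as data cells filters rather than as an argument to grant_permissions. The sketch below builds the two structures involved: the TableData payload for create_data_cells_filter and the DataCellsFilter resource used in the subsequent grant. The helper name build_row_filter, the filter name, and the column referenced in the row expression are hypothetical.

```python
def build_row_filter(catalog_id, database, table, filter_name, row_expression):
    """Return (table_data, grant_resource) for a row-level data cells filter.

    table_data is intended for lakeformation.create_data_cells_filter and
    grant_resource for the Resource argument of grant_permissions.
    """
    identity = {
        "TableCatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Name": filter_name,
    }
    table_data = dict(identity)
    # Restrict rows with a SQL-like predicate; expose all columns
    table_data["RowFilter"] = {"FilterExpression": row_expression}
    table_data["ColumnWildcard"] = {"ExcludedColumnNames": []}
    grant_resource = {"DataCellsFilter": dict(identity)}
    return table_data, grant_resource
```

With these in hand, the flow would be lakeformation.create_data_cells_filter(TableData=table_data) followed by grant_permissions with Resource=grant_resource and Permissions=['SELECT'] for the analyst role.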
Q25: How do you implement a security governance framework?
Answer:
Security Governance Framework:
┌───────────────────────────────────────────────────────────────────┐
│                   Security Governance Framework                   │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Policies              Standards             Procedures           │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ Security Policy │   │ Encryption Std  │   │ Incident Resp   │  │
│  │ Data Policy     │   │ Access Std      │   │ Change Mgmt     │  │
│  │ Acceptable Use  │   │ Network Std     │   │ Patch Mgmt      │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
│                                                                   │
│  Metrics               Reporting             Improvement          │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │
│  │ KPIs            │   │ Dashboards      │   │ Remediation     │  │
│  │ SLAs            │   │ Compliance Rpts │   │ Training        │  │
│  │ Risk Scores     │   │ Audit Reports   │   │ Awareness       │  │
│  └─────────────────┘   └─────────────────┘   └─────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
Security Metrics Dashboard:
from datetime import datetime, timezone

import boto3
from botocore.exceptions import ClientError


# Security metrics
class SecurityMetrics:
    def __init__(self):
        self.cloudwatch = boto3.client('cloudwatch')

    def calculate_security_score(self):
        # Each metric is a 0-100 score; the get_*_score methods other
        # than get_encryption_coverage are not shown here
        metrics = {
            'encryption_coverage': self.get_encryption_coverage(),
            'access_control_score': self.get_access_control_score(),
            'vulnerability_score': self.get_vulnerability_score(),
            'compliance_score': self.get_compliance_score()
        }
        # Weighted average (weights sum to 1.0)
        weights = {
            'encryption_coverage': 0.3,
            'access_control_score': 0.3,
            'vulnerability_score': 0.2,
            'compliance_score': 0.2
        }
        total_score = sum(
            metrics[k] * weights[k] for k in metrics
        )
        return {
            'overall_score': total_score,
            'metrics': metrics,
            'timestamp': datetime.now(timezone.utc).isoformat()
        }

    def get_encryption_coverage(self):
        # Check encryption across all S3 buckets
        s3 = boto3.client('s3')
        buckets = s3.list_buckets()['Buckets']
        encrypted_count = 0
        for bucket in buckets:
            try:
                s3.get_bucket_encryption(Bucket=bucket['Name'])
                encrypted_count += 1
            except ClientError as err:
                # Only a missing configuration counts as unencrypted;
                # surface access-denied and other errors
                if err.response['Error']['Code'] != 'ServerSideEncryptionConfigurationNotFoundError':
                    raise
        return (encrypted_count / len(buckets)) * 100 if buckets else 0
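The weighted average in calculate_security_score can be checked in isolation from AWS. The sketch below is a standalone version that also validates its inputs; the function name weighted_score and the sample metric values are illustrative assumptions, while the weights are the ones used above.

```python
import math


def weighted_score(metrics, weights):
    """Combine 0-100 metric scores into one score using the given weights."""
    if set(metrics) != set(weights):
        raise ValueError("metrics and weights must have the same keys")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    return sum(metrics[name] * weights[name] for name in metrics)


# Weights from the SecurityMetrics class; metric values are made up
WEIGHTS = {
    "encryption_coverage": 0.3,
    "access_control_score": 0.3,
    "vulnerability_score": 0.2,
    "compliance_score": 0.2,
}
SAMPLE_METRICS = {
    "encryption_coverage": 90,
    "access_control_score": 80,
    "vulnerability_score": 70,
    "compliance_score": 100,
}

if __name__ == "__main__":
    print(round(weighted_score(SAMPLE_METRICS, WEIGHTS), 2))
```

For the sample values the arithmetic is 0.3 x 90 + 0.3 x 80 + 0.2 x 70 + 0.2 x 100 = 85, so a perfect encryption score cannot by itself mask weak results elsewhere.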
Summary
Mastering AWS data security requires understanding:
- Identity & Access: IAM, least privilege, MFA, SSO
- Data Protection: Encryption at rest/in transit, key management
- Compliance: GDPR, HIPAA, PCI DSS implementations
- Monitoring: CloudTrail, GuardDuty, Security Hub
- Governance: Policies, standards, procedures, metrics
These concepts form the foundation for building secure, compliant data systems on AWS.