🕸️ Data Mesh on AWS
Implement Data Mesh architecture using Lake Formation and cross-account sharing.
Module: AWS Data Engineering • Topic 22 of 65 • Premium Content
Data Mesh Architecture
Architecture Diagram
┌───────────────────────────────────────────────────────────────────────────┐
│                             DATA MESH ON AWS                              │
│                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │ DOMAIN: Sales                                                       │  │
│  │ Account: 111111111111                                               │  │
│  │  ┌───────────────────────────────────────────────────────────────┐  │  │
│  │  │ S3 Bucket: sales-data-lake                                    │  │  │
│  │  │ Glue Catalog: sales_catalog                                   │  │  │
│  │  │ Lake Formation: sales-domain-permissions                      │  │  │
│  │  │ Tables: customers, orders, transactions                       │  │  │
│  │  └───────────────────────────────────────────────────────────────┘  │  │
│  └──────────────────────────────────┬──────────────────────────────────┘  │
│                                     │ Share via Lake Formation            │
│                                     ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │ CENTRAL GOVERNANCE ACCOUNT (222222222222)                           │  │
│  │  ┌───────────────────────────────────────────────────────────────┐  │  │
│  │  │ Lake Formation Cross-Account Permissions                      │  │  │
│  │  │ • Register locations                                          │  │  │
│  │  │ • Manage sharing policies                                     │  │  │
│  │  │ • Audit access patterns                                       │  │  │
│  │  └───────────────────────────────────────────────────────────────┘  │  │
│  └──────────────────────────────────┬──────────────────────────────────┘  │
│                                     │ Share via Lake Formation            │
│             ┌───────────────────────┼───────────────────────┐             │
│             ▼                       ▼                       ▼             │
│    ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐    │
│    │ Marketing       │     │ Finance         │     │ Analytics       │    │
│    │ (333333333333)  │     │ (444444444444)  │     │ (555555555555)  │    │
│    │ Query sales     │     │ Query sales     │     │ Query all       │    │
│    └─────────────────┘     └─────────────────┘     └─────────────────┘    │
└───────────────────────────────────────────────────────────────────────────┘
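The governance account in the diagram is responsible for auditing. One part of that can be sketched with boto3: listing the Lake Formation grants on a table and flagging those held by principals outside the owning account. This is a minimal sketch rather than a complete audit. The helper names `iter_table_grants` and `external_grants` are illustrative, the client is passed in so the logic can be exercised without AWS, and records of who actually queried the data would come from CloudTrail rather than from this call.

```python
from typing import Any, Dict, Iterable, Iterator, List


def iter_table_grants(client: Any, database: str, table: str) -> Iterator[Dict[str, Any]]:
    """Yield every Lake Formation grant on one table, following NextToken pagination."""
    kwargs: Dict[str, Any] = {
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
    }
    while True:
        page = client.list_permissions(**kwargs)
        yield from page.get("PrincipalResourcePermissions", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def external_grants(grants: Iterable[Dict[str, Any]], home_account: str) -> List[Dict[str, Any]]:
    """Keep only grants whose principal belongs to an account other than home_account."""
    result = []
    for grant in grants:
        principal = grant["Principal"]["DataLakePrincipalIdentifier"]
        if principal.startswith("arn:"):
            # An IAM ARN carries the account ID in its fifth colon-separated field.
            account = principal.split(":")[4]
        elif principal.isdigit():
            # A bare numeric principal is an AWS account ID.
            account = principal
        else:
            # Special principals such as IAM_ALLOWED_PRINCIPALS are skipped here.
            continue
        if account != home_account:
            result.append({"principal": principal, "permissions": grant["Permissions"]})
    return result


# Usage against a real account (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("lakeformation")
#   grants = iter_table_grants(client, "sales_catalog", "customers")
#   for row in external_grants(grants, "111111111111"):
#       print(row)
```

Passing the client as a parameter keeps the pagination and filtering logic independent of credentials, which makes it easy to test with a stub.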
Cross-Account Sharing
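import boto3

lakeformation = boto3.client('lakeformation')

# Register the S3 location so Lake Formation can manage access to it
lakeformation.register_resource(
    ResourceArn='arn:aws:s3:::sales-data-lake/silver/customers/',
    RoleArn='arn:aws:iam::111111111111:role/LakeFormationRole',
    UseServiceLinkedRole=False
)

# Grant cross-account SELECT on the customers table
response = lakeformation.batch_grant_permissions(
    Entries=[
        {
            'Id': 'share-customers',
            'Principal': {
                'DataLakePrincipalIdentifier': 'arn:aws:iam::333333333333:role/AnalystRole'
            },
            'Resource': {
                'Table': {
                    'DatabaseName': 'sales_catalog',
                    'Name': 'customers'
                }
            },
            'Permissions': ['SELECT'],
            # Empty list: the grantee cannot re-grant SELECT to other principals
            'PermissionsWithGrantOption': []
        }
    ]
)

# Per-entry errors are returned in 'Failures', so check the response
for failure in response.get('Failures', []):
    print(failure['RequestEntry']['Id'], failure['Error']['ErrorMessage'])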
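On the consumer side, a shared table is usually made queryable by creating a resource link in the consumer account's own Glue Data Catalog. The sketch below shows the request shape for that step, assuming it runs in account 333333333333 after the grant has been made. The link name `sales_customers_link` and the local database `shared_sales` are hypothetical, and depending on how the accounts are related, an AWS RAM invitation may need to be accepted before the link resolves.

```python
from typing import Any, Dict


def resource_link_input(link_name: str, owner_account: str,
                        database: str, table: str) -> Dict[str, Any]:
    """Build the Glue TableInput for a resource link that points at a shared table."""
    return {
        "Name": link_name,
        "TargetTable": {
            "CatalogId": owner_account,  # account that owns the shared table
            "DatabaseName": database,
            "Name": table,
        },
    }


def create_resource_link(glue_client: Any, local_database: str,
                         table_input: Dict[str, Any]) -> None:
    """Create the link in a database that already exists in the consumer's catalog."""
    glue_client.create_table(DatabaseName=local_database, TableInput=table_input)


# Usage in the consumer account (requires boto3 and AWS credentials):
#   import boto3
#   glue = boto3.client("glue")
#   link = resource_link_input(
#       "sales_customers_link",  # hypothetical link name
#       "111111111111",          # Sales domain account from the diagram
#       "sales_catalog",
#       "customers",
#   )
#   create_resource_link(glue, "shared_sales", link)  # hypothetical local database
```

Once the link exists, the consumer can refer to it by its local name, for example `shared_sales.sales_customers_link` under the hypothetical names above, while Lake Formation continues to enforce the permissions granted by the Sales account.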
Interview Q&A
Q1: What is Data Mesh?
Answer: Data Mesh is a decentralized data architecture where domain teams own and serve their data as products, using self-serve infrastructure.
Q2: How does Lake Formation enable Data Mesh?
Answer: Lake Formation provides cross-account sharing, fine-grained permissions, and centralized governance for decentralized data ownership.
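To make "fine-grained permissions" concrete, here is a minimal sketch of a column-level grant entry. It uses the `TableWithColumns` resource with a column wildcard so that the grantee can select every column except the excluded ones. The column names `email` and `phone` are assumptions for illustration (the schema of `customers` is not given here), and the helper name `column_grant_entry` is hypothetical.

```python
from typing import Any, Dict, Iterable


def column_grant_entry(entry_id: str, principal: str, database: str,
                       table: str, excluded_columns: Iterable[str]) -> Dict[str, Any]:
    """Build one batch_grant_permissions entry granting SELECT on all but some columns."""
    return {
        "Id": entry_id,
        "Principal": {"DataLakePrincipalIdentifier": principal},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Wildcard = every column, minus the ones listed as excluded.
                "ColumnWildcard": {"ExcludedColumnNames": list(excluded_columns)},
            }
        },
        "Permissions": ["SELECT"],
    }


# Finance account from the diagram; 'email' and 'phone' are assumed column names.
entry = column_grant_entry(
    "share-customers-no-pii",
    "444444444444",
    "sales_catalog",
    "customers",
    ["email", "phone"],
)

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("lakeformation").batch_grant_permissions(Entries=[entry])
```

Excluding columns rather than listing allowed ones means newly added columns become visible to the grantee by default; listing `ColumnNames` explicitly is the stricter alternative.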
Q3: What are the four principles of Data Mesh?
Answer: 1) Domain ownership, 2) Data as a product, 3) Self-serve platform, 4) Federated computational governance.
Summary
- Data Mesh: Decentralized, domain-oriented data architecture
- Lake Formation: Enables cross-account sharing with fine-grained permissions
- Cross-Account: Share tables/columns across AWS accounts
- Governance: Central policies with federated execution
- Data Products: Domains own and serve their data with quality SLAs