Snowflake Data Mesh Architecture
Data Mesh is a decentralized data architecture that organizes data around business domains and treats data as a product. Snowflake supports Data Mesh patterns through per-domain databases, data sharing between accounts, and role-based governance.
What is Data Mesh?
- Decentralized ownership by business domains
- Data treated as a product with defined SLAs
- Self-service discovery and federated governance
Architecture Overview
Domain-Specific Data Products
| Domain | Data Products | SLA |
|---|---|---|
| Sales | Revenue, Pipeline, Customers | 99.9% |
| Marketing | Campaigns, Attribution, Segments | 99.5% |
| Operations | Inventory, Logistics, Quality | 99.99% |
Platform Layer
A shared Platform layer provides:
- Discovery, catalog, and access management
- Lineage tracking and quality monitoring
- Self-service APIs and governance controls
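As a rough illustration of self-service discovery, a consumer could list what a domain has published by querying that domain's `INFORMATION_SCHEMA`. This is only a sketch: it assumes the `sales_domain` database and `data_products` schema used in this article's own examples, and that each data product carries a descriptive comment.

```sql
-- List everything published in the Sales domain's data_products schema.
-- Unquoted identifiers are stored in upper case, hence 'DATA_PRODUCTS'.
SELECT table_name,
       table_type,    -- BASE TABLE or VIEW
       table_owner,   -- role that owns the object
       comment,       -- free-text description set at creation time
       last_altered
FROM sales_domain.information_schema.tables
WHERE table_schema = 'DATA_PRODUCTS'
ORDER BY table_name;
```

The query only returns objects the current role is allowed to see, so the same statement doubles as a simple access check.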
Data Product Components
- Identity: Domain ownership, versioning, SLAs
- Interface: SQL views, API endpoints, Snowflake shares, streams/tasks
- Semantics: Schema definition, business glossary, lineage docs
- Discovery: Catalog entry, data contract, version history
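For the lineage part of a data product's semantics, one possible starting point is Snowflake's `ACCOUNT_USAGE.OBJECT_DEPENDENCIES` view, which records which objects a view is built on. The sketch below assumes the data product is a view named `revenue_data_product` in `sales_domain.data_products` (the names used in this article's examples) and that the current role can read the shared `SNOWFLAKE` database.

```sql
-- Upstream objects that the revenue data product depends on.
-- ACCOUNT_USAGE views are populated with some latency, so very
-- recent changes may not appear yet.
SELECT referenced_database,
       referenced_schema,
       referenced_object_name,
       referenced_object_domain   -- e.g. TABLE or VIEW
FROM snowflake.account_usage.object_dependencies
WHERE referencing_database    = 'SALES_DOMAIN'
  AND referencing_schema      = 'DATA_PRODUCTS'
  AND referencing_object_name = 'REVENUE_DATA_PRODUCT';
```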
Key Concepts
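- Data Product: a domain-owned dataset treated as a product, with a defined interface and SLA
- Domain Ownership: each business domain owns and manages its own data products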
Domain Database Structure
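-- Sales Domain (schemas are fully qualified so the script does not depend on session context)
CREATE DATABASE sales_domain;
CREATE SCHEMA sales_domain.raw_data;
CREATE SCHEMA sales_domain.curated;
CREATE SCHEMA sales_domain.data_products;
CREATE SCHEMA sales_domain.analytics;
-- Marketing Domain
CREATE DATABASE marketing_domain;
CREATE SCHEMA marketing_domain.raw_data;
CREATE SCHEMA marketing_domain.curated;
CREATE SCHEMA marketing_domain.data_products;
CREATE SCHEMA marketing_domain.analytics;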
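Domain ownership can be made concrete by handing each domain database to a role controlled by that domain's team. The following is a minimal sketch, not a prescribed setup; the role name `sales_domain_owner` is hypothetical.

```sql
-- Hypothetical owner role for the Sales domain.
CREATE ROLE IF NOT EXISTS sales_domain_owner;

-- Transfer ownership of the domain database and its existing schemas,
-- keeping any grants that were already in place.
GRANT OWNERSHIP ON DATABASE sales_domain
  TO ROLE sales_domain_owner COPY CURRENT GRANTS;
GRANT OWNERSHIP ON ALL SCHEMAS IN DATABASE sales_domain
  TO ROLE sales_domain_owner COPY CURRENT GRANTS;

-- Keep the domain role in the SYSADMIN hierarchy so that
-- administrators retain visibility of the domain's objects.
GRANT ROLE sales_domain_owner TO ROLE SYSADMIN;
```

The same pattern would be repeated for `marketing_domain` with its own owner role.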
Data Product Creation
-- Sales Domain: Revenue Data Product
-- Snowflake has no CREATE DATA PRODUCT statement. The product is published
-- here as a secure view (secure views can be shared), with the SLA and
-- owner recorded in the comment.
CREATE OR REPLACE SECURE VIEW sales_domain.data_products.revenue_data_product
  COMMENT = 'Revenue metrics aggregated by region and time. SLA: 99.9% uptime, 15-minute freshness. Owner: Sales Domain Team'
AS
SELECT
  region,
  DATE_TRUNC('day', order_date) AS date,
  SUM(amount) AS total_revenue,
  COUNT(DISTINCT customer_id) AS unique_customers,
  AVG(amount) AS avg_order_value
FROM sales_domain.curated.orders
WHERE order_date >= DATEADD(year, -1, CURRENT_DATE())
GROUP BY 1, 2;
Data Sharing Across Domains
-- Create share for cross-domain access
CREATE SHARE revenue_share;
GRANT USAGE ON DATABASE sales_domain TO SHARE revenue_share;
GRANT USAGE ON SCHEMA sales_domain.data_products TO SHARE revenue_share;
-- The data product is exposed as a view; only secure views can be added to a share
GRANT SELECT ON VIEW sales_domain.data_products.revenue_data_product TO SHARE revenue_share;
-- Marketing Domain consumes Sales data
CREATE DATABASE revenue_from_sales FROM SHARE account_org.revenue_share;
-- The database created from the share replaces the provider's database name in the path
SELECT * FROM revenue_from_sales.data_products.revenue_data_product;
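The consumer-side statements above only succeed once the provider has added the consuming account to the share. A minimal sketch of that provider-side step follows; the account identifier `marketing_account` is hypothetical.

```sql
-- Run in the provider (Sales) account.
-- 'marketing_account' is a placeholder for the consumer's account identifier.
ALTER SHARE revenue_share ADD ACCOUNTS = marketing_account;

-- Verify what the share exposes and who can consume it.
SHOW GRANTS TO SHARE revenue_share;
SHOW GRANTS OF SHARE revenue_share;
```

Shares connect separate Snowflake accounts. If both domains live in the same account, ordinary role grants on the data product are sufficient and no share is needed.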
Federated Governance
-- Global governance rules
-- IF NOT EXISTS avoids dropping an existing role (and its grants) on re-run
CREATE ROLE IF NOT EXISTS data_product_manager;
CREATE ROLE IF NOT EXISTS data_consumer;
-- Domain-level policies
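-- NOTIFY_USERS takes Snowflake user names, not email addresses;
-- admin_user is a placeholder for a user with a verified email.
CREATE OR REPLACE RESOURCE MONITOR domain_monitor
  WITH CREDIT_QUOTA = 1000
  NOTIFY_USERS = (admin_user)
  TRIGGERS ON 80 PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;
-- A monitor has no effect until it is attached to a warehouse
ALTER WAREHOUSE compute_wh SET RESOURCE_MONITOR = domain_monitor;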
-- Data quality checks
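CREATE OR REPLACE TASK validate_data_product
  WAREHOUSE = compute_wh
  SCHEDULE = '60 MINUTE'
AS
DECLARE
  -- User-defined exception numbers must fall between -20000 and -20999
  freshness_violation EXCEPTION (-20001, 'Data product freshness SLA violated');
  last_update TIMESTAMP_LTZ;
BEGIN
  -- Check freshness (assumes the data product exposes an updated_at column)
  SELECT MAX(updated_at) INTO :last_update
  FROM sales_domain.data_products.revenue_data_product;
  IF (last_update < DATEADD(hour, -2, CURRENT_TIMESTAMP())) THEN
    RAISE freshness_violation;
  END IF;
END;
-- Tasks are created suspended; resume the task to start the schedule
ALTER TASK validate_data_product RESUME;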
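The two governance roles are created without any privileges. One way they might be wired up is sketched below, assuming data products are published as views in `sales_domain.data_products`; adjust `VIEWS` to `TABLES` if a domain materializes its products as tables.

```sql
-- Consumers: read-only access to published data products.
GRANT USAGE ON DATABASE sales_domain TO ROLE data_consumer;
GRANT USAGE ON SCHEMA sales_domain.data_products TO ROLE data_consumer;
GRANT SELECT ON ALL VIEWS IN SCHEMA sales_domain.data_products
  TO ROLE data_consumer;
-- Future grants cover data products published later.
GRANT SELECT ON FUTURE VIEWS IN SCHEMA sales_domain.data_products
  TO ROLE data_consumer;

-- Managers: may publish new data products in the schema.
GRANT USAGE ON DATABASE sales_domain TO ROLE data_product_manager;
GRANT USAGE ON SCHEMA sales_domain.data_products TO ROLE data_product_manager;
GRANT CREATE VIEW ON SCHEMA sales_domain.data_products
  TO ROLE data_product_manager;
```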
Implement data contracts for each data product defining schema, quality, freshness, and access requirements. Use Snowflake's governance features (masking, tagging, access policies) to enforce compliance automatically.
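As a hedged illustration of the tagging and masking features mentioned above, the sketch below records a contract's freshness target as a tag and masks a sensitive column for everyone except data product managers. The tag name `freshness_sla`, the policy name `mask_email`, and the `customers` table with its `email` column are all hypothetical, and the data product is assumed to be a view. Tags and masking policies require Enterprise Edition or higher.

```sql
-- Record the contract's freshness target as a tag on the data product.
-- 'freshness_sla' is a hypothetical tag name.
CREATE TAG IF NOT EXISTS sales_domain.data_products.freshness_sla;
ALTER VIEW sales_domain.data_products.revenue_data_product
  SET TAG sales_domain.data_products.freshness_sla = '15 minutes';

-- Mask a sensitive column unless the session holds the manager role.
CREATE MASKING POLICY IF NOT EXISTS sales_domain.curated.mask_email
  AS (val STRING) RETURNS STRING ->
  CASE
    WHEN IS_ROLE_IN_SESSION('DATA_PRODUCT_MANAGER') THEN val
    ELSE '***MASKED***'
  END;

-- Hypothetical table and column, shown only to illustrate attachment.
ALTER TABLE sales_domain.curated.customers
  MODIFY COLUMN email
  SET MASKING POLICY sales_domain.curated.mask_email;
```

Because the policy is attached to the column, views built on that column inherit the masking without repeating the logic in each data product.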
Data Mesh vs Traditional
| Aspect | Traditional | Data Mesh | Benefit |
|---|---|---|---|
| Architecture | Centralized | Decentralized | Domain agility |
| Ownership | Data team | Domain teams | Business alignment |
| Scaling | Vertical | Horizontal | Cost efficiency |
| Quality | Reactive | Proactive | Reliability |
| Discovery | Manual | Self-service | Faster access |
Key Takeaways
- Data Mesh organizes data around business domains
- Each domain owns and manages its data products
- Platform provides self-service discovery and governance
- Cross-domain sharing uses Snowflake shares and views
- Federated governance maintains standards while enabling autonomy