Snowflake Advanced Β· Interview Prep
Pricing Model & Cost Optimization
Difficulty: Medium Β· Commonly asked at Google, Meta, Amazon
Interview Question
"How do you calculate the total cost of ownership for a Snowflake deployment? Walk me through the pricing model and identify the top 5 cost optimization strategies."
βΉοΈ
Companies Asking This: Google (Senior Cloud Architect), Meta (Data Platform Engineer), Amazon (L6 Data Engineer), Microsoft (Principal Data Architect)
Snowflake Pricing Model
Snowflake pricing has two components: compute (credits) and storage.
Compute Costs (Credits)
| Warehouse Size | Credits/Hour | Credits/Second | Typical Use Case |
|---|---|---|---|
| X-Small | 1 | 0.000278 | Dev/Test |
| Small | 2 | 0.000556 | Interactive BI |
| Medium | 4 | 0.001111 | Standard workloads |
| Large | 8 | 0.002222 | Heavy analytics |
| X-Large | 16 | 0.004444 | ETL/Data Science |
| 2X-Large | 32 | 0.008889 | Large ETL |
| 3X-Large | 64 | 0.017778 | Enterprise-scale |
| 4X-Large | 128 | 0.035556 | Massively parallel |
Storage Costs
-- Check current storage usage
SELECT
table_name,
row_count,
bytes / (1024*1024*1024) AS storage_gb,
retention_time,
CASE
WHEN retention_time > 30 THEN 'HIGH RETENTION - COST IMPACT'
WHEN retention_time > 7 THEN 'MODERATE RETENTION'
ELSE 'LOW RETENTION'
END AS retention_impact
FROM information_schema.tables
WHERE table_schema = 'PUBLIC'
ORDER BY bytes DESC;
-- Calculate total storage cost
SELECT
SUM(bytes) / (1024*1024*1024*1024) AS total_storage_tb,
SUM(bytes) / (1024*1024*1024*1024) * 23 AS estimated_monthly_cost_usd -- $23/TB/month
FROM information_schema.tables;
-- Breakdown by database
SELECT
table_database,
SUM(bytes) / (1024*1024*1024) AS storage_gb,
SUM(bytes) / (1024*1024*1024) * 23 / 30 AS daily_cost_usd
FROM information_schema.tables
GROUP BY 1
ORDER BY 2 DESC;
Cost Optimization Strategies
Strategy 1: Auto-Suspend Optimization
-- Identify warehouses with poor auto-suspend settings
SELECT
warehouse_name,
warehouse_size,
auto_suspend,
CASE
WHEN auto_suspend = 0 THEN 'CRITICAL: Never suspends'
WHEN auto_suspend < 60 THEN 'WARNING: Very short'
WHEN auto_suspend <= 300 THEN 'GOOD'
ELSE 'CONSIDER REDUCING'
END AS recommendation
FROM information_schema.warehouses;
-- Calculate idle costs
SELECT
w.warehouse_name,
w.warehouse_size,
w.auto_suspend,
m.credits_used,
m.start_time,
m.end_time,
DATEDIFF('minute', m.start_time, m.end_time) AS active_minutes,
CASE
WHEN w.auto_suspend = 0 THEN
(24 * 60 - DATEDIFF('minute', m.start_time, m.end_time)) *
(CASE w.warehouse_size
WHEN 'X-Small' THEN 1/60
WHEN 'Small' THEN 2/60
WHEN 'Medium' THEN 4/60
WHEN 'Large' THEN 8/60
WHEN 'X-Large' THEN 16/60
END)
ELSE 0
END AS wasted_credits
FROM information_schema.warehouses w
JOIN snowflake.account_usage.warehouse_metering_history m
ON w.warehouse_name = m.warehouse_name
WHERE m.start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP());
Strategy 2: Right-Sizing Warehouses
-- Identify over-provisioned warehouses
SELECT
warehouse_name,
warehouse_size,
AVG(queries_executed) AS avg_queries_per_hour,
AVG(credits_used) AS avg_credits_per_hour,
CASE
WHEN warehouse_size IN ('X-Large', '2X-Large', '3X-Large', '4X-Large')
AND AVG(queries_executed) < 100 THEN 'DOWNSIZE'
WHEN warehouse_size IN ('Large', 'X-Large')
AND AVG(queries_executed) > 1000 THEN 'UPSIZE'
ELSE 'APPROPRIATE'
END AS sizing_recommendation
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2;
-- Analyze query complexity vs warehouse size
SELECT
warehouse_name,
warehouse_size,
AVG(total_elapsed_time_ms) / 1000 AS avg_query_seconds,
AVG(bytes_scanned / (1024*1024*1024)) AS avg_gb_scanned,
CASE
WHEN AVG(total_elapsed_time_ms) < 1000 AND warehouse_size IN ('Large', 'X-Large')
THEN 'OVER-PROVISIONED'
WHEN AVG(total_elapsed_time_ms) > 10000 AND warehouse_size IN ('Small', 'Medium')
THEN 'UNDER-PROVISIONED'
ELSE 'APPROPRIATE'
END AS analysis
FROM snowflake.account_usage.query_history
WHERE query_type = 'SELECT'
AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2;
Strategy 3: Resource Monitors
-- Create resource monitors for cost control
CREATE RESOURCE MONITOR monthly_budget
WITH CREDIT_QUOTA = 5000
TRIGGERS
ON 50% DO NOTIFY
ON 80% DO NOTIFY
ON 100% DO SUSPEND
ON 110% DO SUSPEND_IMMEDIATE;
-- Create per-warehouse monitors
CREATE RESOURCE MONITOR etl_monitor
WITH CREDIT_QUOTA = 2000
TRIGGERS
ON 80% DO NOTIFY
ON 100% DO SUSPEND;
-- Assign monitors to warehouses
ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = etl_monitor;
ALTER WAREHOUSE bi_wh SET RESOURCE_MONITOR = monthly_budget;
-- Check monitor status
SELECT
name,
credit_quota,
credits_used,
credits_remaining,
suspend_at,
suspend_immediately_at,
notify_at
FROM information_schema.resource_monitors;
Strategy 4: Time Travel & Storage Optimization
-- Reduce retention for non-critical tables
ALTER TABLE dev_test_data SET DATA_RETENTION_TIME_IN_DAYS = 1;
ALTER TABLE staging_data SET DATA_RETENTION_TIME_IN_DAYS = 3;
-- Keep longer retention only for critical tables
ALTER TABLE production_financials SET DATA_RETENTION_TIME_IN_DAYS = 90;
-- Calculate storage cost impact of retention
SELECT
table_name,
retention_time,
bytes / (1024*1024*1024) AS current_storage_gb,
bytes / (1024*1024*1024) * retention_time / 30 AS estimated_total_storage_gb,
bytes / (1024*1024*1024) * retention_time / 30 * 23 / 30 AS daily_storage_cost_usd
FROM information_schema.tables
WHERE retention_time > 7
ORDER BY 5 DESC;
Strategy 5: Query Optimization for Cost
-- Identify expensive queries
SELECT
query_id,
query_text,
warehouse_name,
credits_used,
total_elapsed_time_ms / 1000 AS execution_seconds,
bytes_scanned / (1024*1024*1024) AS gb_scanned,
CASE
WHEN credits_used > 10 THEN 'HIGH COST'
WHEN credits_used > 1 THEN 'MODERATE COST'
ELSE 'LOW COST'
END AS cost_category
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND credits_used > 1
ORDER BY credits_used DESC
LIMIT 20;
-- Calculate cost per query pattern
SELECT
CASE
WHEN query_text LIKE '%SELECT%JOIN%' THEN 'JOIN'
WHEN query_text LIKE '%SELECT%GROUP BY%' THEN 'AGGREGATION'
WHEN query_text LIKE '%INSERT%' THEN 'INSERT'
WHEN query_text LIKE '%UPDATE%' THEN 'UPDATE'
WHEN query_text LIKE '%DELETE%' THEN 'DELETE'
ELSE 'OTHER'
END AS query_pattern,
COUNT(*) AS query_count,
SUM(credits_used) AS total_credits,
AVG(credits_used) AS avg_credits_per_query,
SUM(credits_used) * 10 AS estimated_cost_usd
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 2 DESC;
Cost Monitoring Dashboard
-- Daily cost trend
SELECT
DATE(start_time) AS usage_date,
SUM(credits_used) AS total_credits,
SUM(credits_used) * 10 AS estimated_cost_usd,
COUNT(DISTINCT warehouse_name) AS warehouses_used,
COUNT(*) AS total_queries
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 1;
-- Cost by warehouse
SELECT
warehouse_name,
warehouse_size,
SUM(credits_used) AS total_credits,
SUM(credits_used) * 10 AS estimated_cost_usd,
ROUND(SUM(credits_used) * 100.0 / (SELECT SUM(credits_used) * 10
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())), 2) AS pct_of_total
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 3 DESC;
-- Cost by user
SELECT
user_name,
SUM(credits_used) AS total_credits,
SUM(credits_used) * 10 AS estimated_cost_usd,
COUNT(*) AS query_count,
AVG(total_elapsed_time_ms) / 1000 AS avg_query_seconds
FROM snowflake.account_usage.query_history q
JOIN snowflake.account_usage.warehouse_metering_history w
ON q.warehouse_name = w.warehouse_name
WHERE q.start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 2 DESC;
Best Practices
| Strategy | Expected Savings |
|---|---|
| Auto-suspend optimization | 30-50% |
| Right-sizing warehouses | 20-40% |
| Resource monitors | Prevents overruns |
| Time travel reduction | 10-30% storage |
| Query optimization | 20-50% compute |
β οΈ
Cost Traps:
- AUTO_SUSPEND = 0 β Burns credits 24/7
- Over-sized warehouses β 4X-Large for small queries wastes money
- High retention β 90-day retention on dev tables is expensive
- No resource monitors β Unchecked credit consumption
- Ignoring idle time β Warehouses running without queries