Snowflake Advanced Β· Interview Prep
Query Performance & Result Cache
Difficulty: Hard Β· Commonly asked at Netflix, Amazon, Google
Interview Question
"A query that processes 1TB of data takes 30 seconds. After a data load, it now takes 5 minutes. Walk me through your diagnostic process and optimization strategies."
βΉοΈ
Companies Asking This: Netflix (Staff Data Engineer), Amazon (L6 Data Engineer), Google (Senior Data Engineer), Meta (Data Platform Engineer)
Performance Diagnostic Framework
Step 1: Query Analysis with PROFILE
-- Run query with PROFILE to get detailed metrics
SELECT
c.customer_name,
COUNT(o.order_id) AS order_count,
SUM(o.amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01'
GROUP BY 1
ORDER BY 3 DESC;
-- Get query profile
SELECT
query_id,
query_text,
user_name,
warehouse_name,
compilation_time_ms,
execution_time_ms,
total_elapsed_time_ms,
bytes_scanned / (1024*1024*1024) AS gb_scanned,
bytes_written / (1024*1024) AS mb_written,
bytes_spilled_to_local_storage / (1024*1024) AS mb_spill_local,
bytes_spilled_to_remote_storage / (1024*1024) AS mb_spill_remote,
partitions_scanned,
partitions_total,
ROUND(partitions_scanned * 100.0 / NULLIF(partitions_total, 0), 2) AS scan_percentage,
result_cache_hit
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%customers c JOIN orders o%'
AND start_time >= DATEADD(hour, -24, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time_ms DESC
LIMIT 5;
Result Cache
-- Check if result cache was used
SELECT
query_id,
query_text,
result_cache_hit, -- 1 = cached, 0 = executed
compilation_time_ms,
execution_time_ms,
total_elapsed_time_ms
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%SELECT%FROM orders%'
AND start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;
-- Force cache bypass for testing
SELECT * FROM orders
/* NO_RESULT_CACHE */
WHERE order_date = CURRENT_DATE();
-- Cache invalidation scenarios:
-- 1. Underlying data changes
-- 2. Volatile functions (CURRENT_DATE(), RANDOM())
-- 3. User role changes
-- 4. Schema changes
-- Test cache behavior
-- First execution (not cached)
SELECT COUNT(*) FROM orders WHERE order_date = CURRENT_DATE();
-- Second execution (cached)
SELECT COUNT(*) FROM orders WHERE order_date = CURRENT_DATE();
-- Check result
SELECT
query_id,
result_cache_hit,
total_elapsed_time_ms
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%COUNT(*)%orders%CURRENT_DATE%'
ORDER BY start_time DESC;
Scan Depth Analysis
-- Analyze micro-partition pruning efficiency
SELECT
query_id,
partitions_scanned,
partitions_total,
ROUND(partitions_scanned * 100.0 / NULLIF(partitions_total, 0), 2) AS scan_pct,
CASE
WHEN partitions_scanned = partitions_total THEN 'FULL SCAN - CRITICAL'
WHEN partitions_scanned > partitions_total * 0.5 THEN 'POOR PRUNING - WARNING'
WHEN partitions_scanned > partitions_total * 0.1 THEN 'MODERATE PRUNING'
ELSE 'EXCELLENT PRUNING'
END AS pruning_quality
FROM snowflake.account_usage.query_history
WHERE query_type = 'SELECT'
AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND partitions_total > 100
ORDER BY scan_pct DESC
LIMIT 20;
-- Compare pruning before and after data load
-- Before load
SELECT
'BEFORE_LOAD' AS period,
AVG(partitions_scanned * 100.0 / partitions_total) AS avg_scan_pct,
AVG(total_elapsed_time_ms) AS avg_query_time_ms
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%orders%'
AND start_time BETWEEN '2024-01-10' AND '2024-01-15'
UNION ALL
-- After load
SELECT
'AFTER_LOAD' AS period,
AVG(partitions_scanned * 100.0 / partitions_total) AS avg_scan_pct,
AVG(total_elapsed_time_ms) AS avg_query_time_ms
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%orders%'
AND start_time BETWEEN '2024-01-16' AND '2024-01-20';
Real-World Scenario: Netflix
Question: "How do you optimize a dashboard that queries 50TB of data and needs sub-second response times?"
Solution: Multi-Layer Optimization
-- 1. Create materialized view for common aggregations
CREATE MATERIALIZED VIEW mv_dashboard_summary AS
SELECT
DATE_TRUNC('hour', event_timestamp) AS event_hour,
event_type,
region,
device_type,
COUNT(*) AS event_count,
COUNT(DISTINCT user_id) AS unique_users,
SUM(CASE WHEN event_type = 'purchase' THEN revenue ELSE 0 END) AS total_revenue
FROM events
WHERE event_timestamp >= DATEADD(day, -90, CURRENT_DATE())
GROUP BY 1, 2, 3, 4;
-- 2. Create clustering key aligned with dashboard queries
ALTER TABLE events CLUSTER BY (event_timestamp, event_type, region);
-- 3. Use result cache aggressively
-- Dashboard queries should use same query structure for cache hits
-- Instead of:
SELECT * FROM events WHERE event_date = CURRENT_DATE() AND event_type = 'click';
-- Use:
SELECT * FROM events WHERE event_date = '2024-01-15' AND event_type = 'click';
-- 4. Create search optimization for common filters
ALTER TABLE events SET SEARCH_OPTIMIZATION = ON;
ALTER TABLE events ADD SEARCH OPTIMIZATION ON (event_type, region, device_type);
-- 5. Monitor dashboard performance
SELECT
query_id,
query_text,
total_elapsed_time_ms,
bytes_scanned / (1024*1024*1024) AS gb_scanned,
result_cache_hit,
CASE
WHEN result_cache_hit = 1 THEN 'CACHED'
WHEN total_elapsed_time_ms < 1000 THEN 'FAST'
WHEN total_elapsed_time_ms < 5000 THEN 'ACCEPTABLE'
ELSE 'SLOW - NEEDS OPTIMIZATION'
END AS performance_status
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%dashboard%'
AND start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time_ms DESC;
Real-World Scenario: Amazon
Question: "A query joins 5 tables and spills to remote storage. How do you optimize without changing the query?"
Optimization Without Query Changes
-- 1. Analyze spill metrics
SELECT
query_id,
bytes_spilled_to_local_storage / (1024*1024*1024) AS local_spill_gb,
bytes_spilled_to_remote_storage / (1024*1024*1024) AS remote_spill_gb,
execution_time_ms / 1000 AS execution_seconds
FROM snowflake.account_usage.query_history
WHERE query_id = 'your-query-id';
-- 2. Increase warehouse size
ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'X-LARGE';
-- 3. Add clustering keys to joined tables
ALTER TABLE orders CLUSTER BY (customer_id, order_date);
ALTER TABLE customers CLUSTER BY (customer_id);
ALTER TABLE products CLUSTER BY (product_id);
-- 4. Create materialized view for the join
CREATE MATERIALIZED VIEW orders_customers_products AS
SELECT
o.order_id,
o.order_date,
o.amount,
c.customer_name,
c.segment,
p.product_name,
p.category
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id;
-- 5. Monitor improvement
SELECT
'BEFORE' AS optimization,
AVG(bytes_spilled_to_remote_storage / (1024*1024*1024)) AS avg_remote_spill_gb,
AVG(execution_time_ms / 1000) AS avg_execution_seconds
FROM snowflake.account_usage.query_history
WHERE query_id = 'before-optimization-query-id'
UNION ALL
SELECT
'AFTER' AS optimization,
AVG(bytes_spilled_to_remote_storage / (1024*1024*1024)) AS avg_remote_spill_gb,
AVG(execution_time_ms / 1000) AS avg_execution_seconds
FROM snowflake.account_usage.query_history
WHERE query_id = 'after-optimization-query-id';
Performance Monitoring Dashboard
-- 1. Query performance trends
SELECT
DATE(start_time) AS query_date,
AVG(total_elapsed_time_ms) / 1000 AS avg_seconds,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_elapsed_time_ms) / 1000 AS median_seconds,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY total_elapsed_time_ms) / 1000 AS p95_seconds,
COUNT(*) AS query_count,
SUM(CASE WHEN result_cache_hit = 1 THEN 1 ELSE 0 END) AS cached_queries
FROM snowflake.account_usage.query_history
WHERE query_type = 'SELECT'
AND start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 1;
-- 2. Warehouse performance comparison
SELECT
warehouse_name,
warehouse_size,
AVG(total_elapsed_time_ms) / 1000 AS avg_query_seconds,
SUM(bytes_scanned / (1024*1024*1024)) AS total_tb_scanned,
SUM(credits_used) AS total_credits,
SUM(bytes_scanned / (1024*1024*1024)) / NULLIF(SUM(credits_used), 0) AS tb_per_credit
FROM snowflake.account_usage.query_history q
JOIN snowflake.account_usage.warehouse_metering_history w
ON q.warehouse_name = w.warehouse_name
WHERE q.start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 5 DESC;
-- 3. Identify performance bottlenecks
SELECT
CASE
WHEN compilation_time_ms > 1000 THEN 'COMPILATION'
WHEN bytes_spilled_to_remote_storage > 0 THEN 'REMOTE_SPILL'
WHEN partitions_scanned = partitions_total THEN 'FULL_SCAN'
WHEN total_elapsed_time_ms > 30000 THEN 'LONG_RUNNING'
ELSE 'NORMAL'
END AS bottleneck_type,
COUNT(*) AS query_count,
AVG(total_elapsed_time_ms) / 1000 AS avg_seconds
FROM snowflake.account_usage.query_history
WHERE query_type = 'SELECT'
AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 2 DESC;
Best Practices
| Issue | Solution |
|---|---|
| Full table scans | Add clustering keys, use search optimization |
| Remote spill | Increase warehouse size, reduce data volume |
| Slow compilation | Simplify queries, use materialized views |
| Low cache hit rate | Use consistent query patterns, avoid volatile functions |
| High queue depth | Use multi-cluster warehouse |
| Slow joins | Cluster tables by join keys |
β οΈ
Performance Anti-Patterns:
- **SELECT *** β Only select needed columns
- Missing WHERE clauses β Filter early to reduce data scanned
- Using functions in WHERE β Prevents predicate pushdown
- Not monitoring β Set up alerts for query time degradation
- Ignoring spill β Remote spill kills performance