Virtual Warehouse Management: Multi-Cluster, Auto-Suspend & Scaling

Architecture Diagram 1: Warehouse Hierarchy & Sizing

Architecture Diagram 2: Multi-Cluster Scaling Behavior

Architecture Diagram 3: Warehouse Hierarchy

Architecture Diagram 4: Multi-Cluster Auto-Scaling Timeline

Architecture Diagram 5: Scaling Policies Comparison

A multi-cluster warehouse automatically manages multiple independent compute clusters under a logical warehouse. When the primary cluster becomes overloaded (queue depth exceeds threshold), Snowflake starts additional clusters. When load decreases, clusters are suspended to reduce costs.

Scaling policies control how aggressively warehouses add clusters. STANDARD scales quickly (queue depth ≥ 1 triggers new cluster), prioritizing performance. ECONOMY scales conservatively (queue depth ≥ 6 triggers), prioritizing cost savings with some queueing tolerance.

Credit Consumption Rate

credits\_per\_second = base\_rate \times 2^{size\_index}

Warehouse Memory Model

memory_per_node = 2 GB × 2^size_index
memory_per_vwh = memory_per_node × nodes_per_cluster
Example: XL warehouse = 32 GB/node × 4 nodes = 128 GB total
Local SSD cache = memory_per_node × 1 (1:1 ratio)
Spillage occurs when working set exceeds memory → spills to local SSD → remote storage

Theorem: Auto-suspend after T seconds of inactivity guarantees zero credit consumption for idle periods, with resume latency bounded by cloud provider cold-start time (typically 30–60 seconds). Proof: Warehouse suspends all compute nodes when idle time ≥ T. No queries execute during suspension. Auto-resume re-initializes nodes on next query submission. The 60-second minimum billing granularity ensures no partial-second charges.

Set AUTO_SUSPEND=60 for development warehouses (frequent idle periods), AUTO_SUSPEND=300 for production (balance between responsiveness and cost). Use SCALING_POLICY=ECONOMY for batch ETL where some queueing is acceptable.

Monitor queue depth: Target < 5 queries waiting; > 10 indicates under-provisioning
Check CPU utilization: 60–80% is optimal; > 95% indicates need to scale up
Review auto-scaling events: Frequent ADD_CLUSTER suggests under-provisioned MAX_CLUSTER_COUNT
Analyze credit consumption: Compare credits_used vs. queries_completed for efficiency
Right-size: Scale down if utilization consistently < 40%; scale up if consistently > 90%
Test: Run representative workload at new size and verify queue depth remains < 5

Multi-cluster warehouses auto-scale 1–10 clusters based on queue depth
Credit billing: Per-second (60s minimum) based on warehouse size × time
STANDARD policy: Fast scaling for interactive workloads
ECONOMY policy: Conservative scaling for batch ETL (cost-optimized)
Auto-suspend: 60–86400 seconds; balance responsiveness vs. cost

Detailed Explanation

What are Virtual Warehouses?

Virtual warehouses are independent compute clusters that execute SQL queries and DML operations. Unlike traditional fixed-size appliances, they are elastic and scale dynamically based on workload demands.

Virtual Warehouse Architecture

Each warehouse: independent nodes with CPU, RAM, and local SSD cache
Complete isolation — multiple warehouses access same data without contention
Cloud services layer manages query routing and metadata independently

Multi-Cluster Warehouse Operations

Automatically manages multiple independent clusters under one logical warehouse
Scaling triggers: queue depth exceeding threshold
Scaling behavior:
- STANDARD policy — scales aggressively (queue ≥ 1)
- ECONOMY policy — scales conservatively (queue ≥ 6)
Queries routed to cluster with lowest queue depth
Suspended clusters complete running queries before shutdown

Auto-Suspend and Auto-Resume

Feature	Configuration	Behavior
Auto-Suspend	60–86,400 seconds	Suspends after inactivity
Auto-Resume	TRUE/FALSE	Resumes on next query (30–60s typical)

Common defaults: 60s for dev, 300s for production
Zero credit consumption during suspension

Resource Monitoring and Cost Control

Warehouse-level monitors — individual workload cost attribution
Account-level monitors — aggregate consumption visibility
Triggers: NOTIFY, SUSPEND, SUSPEND_IMMEDIATELY at threshold percentages
Email notifications and external monitoring integration

Performance Optimization Strategies

Right-size — scale up for complex analytics, down for simple queries
Scaling policies — STANDARD for interactive, ECONOMY for batch ETL
Statement timeouts — STATEMENT_TIMEOUT (max execution) + STATEMENT_QUEUED_TIMEOUT (max queue wait)
Leverage result cache — avoid non-deterministic functions for cache hits

Key Takeaway: Use SCALING_POLICY=ECONOMY for batch ETL where queueing is acceptable, STANDARD for interactive workloads requiring fast response.

Key Concepts Table

Parameter	Range	Default	Description
WAREHOUSE_SIZE	XS-4XL	M	Compute capacity per node
MIN_CLUSTER_COUNT	1-10	1	Minimum active clusters
MAX_CLUSTER_COUNT	1-10	1	Maximum auto-scaled clusters
SCALING_POLICY	STANDARD/ECONOMY	STANDARD	Scaling aggressiveness
AUTO_SUSPEND	60-86400	600	Inactivity timeout (seconds)
AUTO_RESUME	TRUE/FALSE	TRUE	Auto-resume on query
STATEMENT_TIMEOUT	0-86400	0	Max execution time (seconds)

Warehouse Size	vCPU	Memory	Max Concurrent	Cost/Hour
X-Small	1	2 GB	1	~$0.00036/s
Small	2	4 GB	2	~$0.00072/s
Medium	4	8 GB	4	~$0.00144/s
Large	8	16 GB	8	~$0.00288/s
X-Large	16	32 GB	16	~$0.00576/s
2X-Large	32	64 GB	32	~$0.01152/s
3X-Large	64	128 GB	64	~$0.02304/s
4X-Large	128	256 GB	128	~$0.04608/s

Metric	Target	Warning	Critical
Queue Depth	< 5	5-10	> 10
CPU Utilization	60-80%	80-95%	> 95%
Auto-Suspend Time	60-300s	300-600s	> 600s
Credit Consumption	< 80% quota	80-95%	> 95%

Code Examples

-- Example 1: Create warehouse with comprehensive settings
CREATE WAREHOUSE analytics_wh
    WAREHOUSE_SIZE = 'xlarge'
    MIN_CLUSTER_COUNT = 2
    MAX_CLUSTER_COUNT = 8
    SCALING_POLICY = 'STANDARD'
    AUTO_SUSPEND = 300
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = FALSE
    STATEMENT_TIMEOUT_IN_SECONDS = 3600
    STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600
    RESOURCE_MONITOR = 'analytics_monitor'
    COMMENT = 'Multi-cluster warehouse for analytics workloads';

-- Example 2: Create resource monitor
CREATE RESOURCE MONITOR account_monitor
    WITH
    CREDIT_QUOTA = 10000
    FREQUENCY = MONTHLY
    START_TIMESTAMP = IMMEDIATELY
    TRIGGERS
        ON 80% DO NOTIFY
        ON 90% DO SUSPEND
        ON 95% DO SUSPEND_IMMEDIATELY;

-- Example 3: Create warehouse-specific monitor
CREATE RESOURCE MONITOR etl_monitor
    WITH
    CREDIT_QUOTA = 5000
    FREQUENCY = MONTHLY
    TRIGGERS
        ON 75% DO NOTIFY
        ON 90% DO SUSPEND;

-- Assign monitor to warehouse
ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = etl_monitor;

-- Example 4: Monitor warehouse performance
SELECT 
    warehouse_name,
    warehouse_size,
    cluster_number,
    AVG(queries_completed) as avg_queries,
    AVG(queued_overload_queries) as avg_queued,
    AVG(execution_time_ms) / 1000 as avg_exec_seconds,
    SUM(credits_used) as total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY total_credits DESC;

-- Example 5: Analyze auto-scaling events
SELECT 
    warehouse_name,
    event_name,
    event_timestamp,
    event_reason,
    cluster_number
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY(
    START_TIME => DATEADD(hour, -24, CURRENT_TIMESTAMP())
))
WHERE event_name IN ('RESUME', 'SUSPEND', 'ADD_CLUSTER', 'REMOVE_CLUSTER')
ORDER BY event_timestamp DESC;

-- Example 6: Monitor current warehouse status
SELECT 
    warehouse_name,
    state,
    num_queries,
    num_queued_queries,
    time_active,
    time_suspended
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY(
    START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP())
))
ORDER BY time_active DESC;

-- Example 7: Dynamic warehouse scaling
-- Scale up warehouse for batch processing
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = '2xlarge';

-- Scale out warehouse for peak load
ALTER WAREHOUSE analytics_wh SET MAX_CLUSTER_COUNT = 10;

-- Scale in warehouse during off-hours
ALTER WAREHOUSE analytics_wh SET MAX_CLUSTER_COUNT = 2;

-- Example 8: Warehouse usage analysis
SELECT 
    warehouse_name,
    DATE_TRUNC('day', start_time) as usage_date,
    SUM(credits_used) as daily_credits,
    SUM(credits_used) * 3 as daily_cost_usd,  -- ~$3/credit
    AVG(queries_completed) as avg_queries_per_hour
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -90, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY usage_date DESC, daily_credits DESC;

-- Example 9: Optimize warehouse for specific workload
-- Create warehouse optimized for large scans
CREATE WAREHOUSE scan_wh
    WAREHOUSE_SIZE = '3xlarge'
    AUTO_SUSPEND = 600
    STATEMENT_TIMEOUT_IN_SECONDS = 7200
    COMMENT = 'Optimized for large table scans';

-- Create warehouse for interactive queries
CREATE WAREHOUSE interactive_wh
    WAREHOUSE_SIZE = 'medium'
    AUTO_SUSPEND = 60
    STATEMENT_TIMEOUT_IN_SECONDS = 300
    STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 60
    COMMENT = 'Optimized for interactive queries';

-- Example 10: Monitor query performance by warehouse
SELECT 
    warehouse_name,
    AVG(execution_time_ms) / 1000 as avg_exec_seconds,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY execution_time_ms) / 1000 as median_exec_seconds,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY execution_time_ms) / 1000 as p95_exec_seconds,
    COUNT(*) as total_queries,
    SUM(CASE WHEN execution_time_ms > 60000 THEN 1 ELSE 0 END) as slow_queries
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
  AND execution_status = 'SUCCESS'
GROUP BY 1
ORDER BY avg_exec_seconds DESC;

Performance Metrics

Metric	Target	Warning	Critical	Description
Query Queue Depth	< 5	5-10	> 10	Queries waiting to execute
Cluster Scaling Time	< 30s	30-60s	> 60s	Time to start new cluster
Auto-Suspend Delay	60-300s	300-600s	> 600s	Time before warehouse suspends
Auto-Resume Time	< 60s	60-120s	> 120s	Time to resume suspended warehouse
Credit Utilization	60-80%	80-95%	> 95%	Active time vs total time

Best Practices

Right-size warehouses: Start with Medium for ad-hoc queries, scale up for batch ETL. Monitor queue depth to identify under-provisioned warehouses.
Use multi-cluster warehouses: Set MIN_CLUSTER_COUNT=1 and MAX_CLUSTER_COUNT=3-5 for concurrent workloads. Use SCALING_POLICY=ECONOMY for cost optimization.
Implement warehouse isolation: Create separate warehouses for different workloads (ETL, reporting, ad-hoc) to prevent resource contention and enable independent scaling.
Configure appropriate timeouts: Set STATEMENT_QUEUED_TIMEOUT for queries that shouldn't run indefinitely and STATEMENT_TIMEOUT to kill long-running queries.
Leverage resource monitors: Set up alerts for credit consumption to prevent unexpected costs. Use MAX_CREDIT_QUOTA per warehouse for budget control.
Optimize auto-suspend: Set AUTO_SUSPEND=60 for development warehouses, AUTO_SUSPEND=300 for production, and AUTO_SUSPEND=60 for infrequently used warehouses.
Monitor scaling events: Track warehouse load history to identify scaling patterns and optimize MIN/MAX_CLUSTER_COUNT settings.
Use warehouse groups: Organize warehouses by workload type (analytics, ETL, development) to simplify management and cost attribution.
Implement cost allocation: Use resource monitors and tags to attribute warehouse costs to specific departments or projects.
Regular performance reviews: Analyze warehouse performance metrics weekly to identify optimization opportunities and right-sizing needs.

Virtual Warehouse Management: Multi-Cluster, Auto-Suspend & Scaling

Virtual Warehouse Management: Multi-Cluster, Auto-Suspend & Scaling

Architecture Diagram 1: Warehouse Hierarchy & Sizing

Architecture Diagram 2: Multi-Cluster Scaling Behavior

Architecture Diagram 3: Warehouse Hierarchy

Architecture Diagram 4: Multi-Cluster Auto-Scaling Timeline

Architecture Diagram 5: Scaling Policies Comparison

Warehouse Memory Model

Detailed Explanation

What are Virtual Warehouses?

Virtual Warehouse Architecture

Multi-Cluster Warehouse Operations

Auto-Suspend and Auto-Resume

Resource Monitoring and Cost Control

Performance Optimization Strategies

Key Concepts Table

Code Examples

Performance Metrics

Best Practices

See Also

Premium Content

Need Expert Snowflake Help?