dbt Cloud Features
Job Scheduling
Architecture Diagram
+------------------------------------------------------+
|             JOB SCHEDULING ARCHITECTURE              |
+------------------------------------------------------+
|                                                      |
|  +------------------------------------------------+  |
|  |               JOB CONFIGURATION                |  |
|  |                                                |  |
|  |  Job: Production Run                           |  |
|  |  +-- Trigger: Scheduled (Daily 2:00 AM UTC)    |  |
|  |  +-- Environment: Production                   |  |
|  |  +-- Commands:                                 |  |
|  |  |   +-- dbt deps                              |  |
|  |  |   +-- dbt seed                              |  |
|  |  |   +-- dbt run --full-refresh                |  |
|  |  |   +-- dbt test                              |  |
|  |  +-- Notifications:                            |  |
|  |  |   +-- Slack: #data-engineering              |  |
|  |  |   +-- Email: team@company.com               |  |
|  |  +-- Alert on: failure                         |  |
|  +------------------------------------------------+  |
|                          |                           |
|                          v                           |
|  +------------------------------------------------+  |
|  |               SCHEDULE PATTERNS                |  |
|  |                                                |  |
|  |  +---------+-------------------------------+   |  |
|  |  | Pattern | Configuration                 |   |  |
|  |  +---------+-------------------------------+   |  |
|  |  | Hourly  | "0 * * * *"                   |   |  |
|  |  | Daily   | "0 2 * * *"                   |   |  |
|  |  | Weekly  | "0 2 * * 1"                   |   |  |
|  |  | Monthly | "0 2 1 * *"                   |   |  |
|  |  | Custom  | "0 2 * * 1-5" (weekdays only) |   |  |
|  |  +---------+-------------------------------+   |  |
|  +------------------------------------------------+  |
|                                                      |
+------------------------------------------------------+
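To make the schedule patterns in the diagram concrete, here is a minimal Python sketch of how a five-field cron string selects run times. The helper names `cron_matches` and `_field_matches` are hypothetical, timestamps are assumed to be in UTC, and the sketch only handles `*`, single numbers, comma lists, and ranges. It also requires every field to match, which differs from standard cron when both day-of-month and day-of-week are restricted; none of the patterns above do that.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Check one cron field; supports '*', single numbers, lists and ranges."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by a 5-field cron string."""
    minute, hour, day, month, weekday = expr.split()
    # Cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )


monday = datetime(2024, 1, 1, 2, 0)    # a Monday, 02:00
saturday = datetime(2024, 1, 6, 2, 0)  # a Saturday, 02:00
print(cron_matches("0 2 * * 1-5", monday))    # True
print(cron_matches("0 2 * * 1-5", saturday))  # False
```

The weekday conversion is the part most often gotten wrong: `1-5` means Monday through Friday in cron, while Python numbers the same days 0 through 4.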
Detailed Explanation
dbt Cloud is the hosted, managed offering of dbt (not only an enterprise edition): a platform for data transformation with managed infrastructure, scheduling, and monitoring.
What are the Core Features?
Web-based IDE
- Write SQL and Jinja in the browser
- Git integration with visual diff
- Auto-completion and syntax highlighting
- Interactive documentation viewer
Job Scheduling
- Cron-based scheduling
- Event-driven triggers (Git, API)
- Dependency chains
- Alert notifications
Slim CI
- Selective execution of modified models
- State comparison between branches
- Cost optimization for CI/CD
- Fast feedback loops
Monitoring and Observability
- Run history and logs
- Performance metrics
- Cost tracking
- Error alerting
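The API trigger listed under Job Scheduling can be sketched as follows. This assumes the dbt Cloud Administrative API v2 "trigger job run" endpoint and token-style authorization; the host, account ID, job ID, and token are placeholders, and the helper `build_trigger_request` is a hypothetical name. The request is only built here, not sent.

```python
import json
import urllib.request


def build_trigger_request(host, account_id, job_id, token, cause):
    """Build (but do not send) a POST request that asks dbt Cloud to run a job."""
    url = f"https://{host}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace with a real account, job, and service token.
request = build_trigger_request(
    host="cloud.getdbt.com",
    account_id=12345,
    job_id=67890,
    token="<service-token>",
    cause="Triggered by upstream loader",
)
print(request.get_method(), request.full_url)
# Sending is left out on purpose: urllib.request.urlopen(request) would start a run.
```

The host depends on the account's region and plan, so treat `cloud.getdbt.com` as an assumption and check the access URL shown in the account settings.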
What are the Enterprise Features?
| Feature | Capabilities |
|---|---|
| SSO and Authentication | SAML 2.0, RBAC, audit logging, IP allowlisting |
| Multi-tenant Architecture | Environment isolation, resource quotas, cost allocation, compliance controls |
| Semantic Layer | Centralized metrics, version control, API access, BI integration |
| Mesh | Cross-project references, data contracts, shared semantic models, governance controls |
What are the Best Practices for dbt Cloud?
- Use Slim CI - Build and test only modified models and their downstream dependents
- Set up alerts - Notify on failures
- Monitor costs - Track warehouse usage
- Use environments - Separate dev/staging/prod
- Version control - All configurations in Git
- Document jobs - Clear naming and descriptions
- Test regularly - Automated quality checks
- Review logs - Monitor execution details
Key Takeaway: dbt Cloud provides enterprise-grade features for scheduling, monitoring, and governance, enabling teams to manage data transformation at scale.
Code Examples
Job Configuration (YAML)
# .dbt_cloud/job_config.yml
# Illustrative jobs-as-code layout. dbt Cloud jobs are normally defined
# in the UI, through the API, or with Terraform.
jobs:
  - name: "Production Run"
    description: "Daily production run for all models"
    environment: "Production"
    triggers:
      - type: "scheduled"
        cron: "0 2 * * *"        # daily at 02:00 UTC
      - type: "git_push"
        branches: ["main"]
    steps:                       # executed in order
      - command: "dbt deps"
      - command: "dbt seed"
      - command: "dbt run --full-refresh"
      - command: "dbt test"
    notifications:
      - type: "slack"
        channel: "#data-engineering"
      - type: "email"
        recipients:
          - "data-eng@company.com"
    settings:
      warehouse: "ANALYTICS_WH"
      schema: "production"
      threads: 8
    alert_on:
      - "failure"
      - "warning"
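As a rough illustration of how the `steps` and `alert_on` settings above interact, the following sketch runs commands in order, stops at the first failure, and reports whether an alert should fire. The function `run_job` and the stand-in `fake_execute` are hypothetical, and the stop-on-first-failure behaviour is an assumption about the runner rather than something the configuration states.

```python
def run_job(steps, execute, alert_on):
    """Run steps in order, stop at the first failure, and decide whether to alert.

    `execute` is a caller-supplied function returning "success", "warning",
    or "failure" for a command string.
    """
    results = []
    status = "success"
    for command in steps:
        outcome = execute(command)
        results.append((command, outcome))
        if outcome == "failure":
            status = "failure"
            break  # later steps are skipped once a step fails
        if outcome == "warning":
            status = "warning"
    return {"status": status, "results": results, "alert": status in alert_on}


steps = ["dbt deps", "dbt seed", "dbt run --full-refresh", "dbt test"]


def fake_execute(command):
    # Stand-in for a real runner: pretend the seed step fails.
    return "failure" if command == "dbt seed" else "success"


outcome = run_job(steps, fake_execute, alert_on=["failure", "warning"])
print(outcome["status"], len(outcome["results"]), outcome["alert"])
# failure 2 True
```

Passing the executor in as a function keeps the ordering and alerting logic testable without a warehouse connection.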
Slim CI Configuration
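# .dbt_cloud/ci_config.yml
# Illustrative layout. In a dbt Cloud CI job the same selection is usually
# expressed as: dbt build --select state:modified+ (with deferral to production).
ci:
  enabled: true
  state:
    compare:                     # artifacts from the comparison run
      - "manifest.json"
      - "run_results.json"
  selection:
    strategy: "modified"
    include:
      - "state:modified+"        # changed nodes plus everything downstream
      - "state:new+"
    exclude:
      - "tag:deprecated"
  optimization:
    enabled: true
    max_run_time: "30m"
    cost_threshold: 100
  notifications:
    on_success:
      - type: "slack"
        channel: "#ci-results"
    on_failure:
      - type: "slack"
        channel: "#ci-alerts"
      - type: "pagerduty"
        severity: "critical"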
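The `state:modified+` selection above can be approximated with a small sketch: compare per-node checksums from two manifests, then walk the dependency graph downstream. The function `select_modified_plus`, the checksum dictionaries, and the `children` map are hypothetical simplifications; dbt itself compares more than a file checksum (configs and other properties, for example).

```python
def select_modified_plus(prod_checksums, branch_checksums, children, excluded=()):
    """Return changed or new nodes plus everything downstream of them."""
    changed = {
        node
        for node, checksum in branch_checksums.items()
        if prod_checksums.get(node) != checksum  # new nodes have no prod checksum
    }
    selected, stack = set(), list(changed)
    while stack:  # depth-first walk over the child map
        node = stack.pop()
        if node in selected:
            continue
        selected.add(node)
        stack.extend(children.get(node, []))
    return sorted(selected - set(excluded))


prod = {"stg_orders": "a1", "stg_customers": "b1", "fct_orders": "c1", "dim_customers": "d1"}
branch = {**prod, "stg_orders": "a2", "fct_returns": "e1"}  # one edit, one new model
children = {
    "stg_orders": ["fct_orders", "fct_returns"],
    "stg_customers": ["dim_customers"],
}
print(select_modified_plus(prod, branch, children))
# ['fct_orders', 'fct_returns', 'stg_orders']
```

Only three of the five models are selected here, which is where the CI time and cost savings come from: untouched branches of the graph such as `stg_customers` and `dim_customers` are never built.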
Semantic Layer Configuration
# .dbt_cloud/semantic_layer.yml
# Illustrative layout. The dbt Semantic Layer itself defines metrics
# in MetricFlow YAML inside the dbt project.
semantic_layer:
  enabled: true
  metrics:
    - name: "total_revenue"
      type: "simple"
      expression: "sum(amount)"
      description: "Total revenue from all orders"
    - name: "order_count"
      type: "simple"
      expression: "count(*)"
      description: "Total number of orders"
    - name: "avg_order_value"
      type: "derived"            # built from the two simple metrics above
      expression: "total_revenue / order_count"
      description: "Average order value"
  dimensions:
    - name: "order_date"
      type: "time"
      granularity: "day"
    - name: "customer_segment"
      type: "categorical"
  access:
    - type: "bi_tool"
      name: "Looker"
      permissions: ["read"]
    - type: "application"
      name: "Feature Store"
      permissions: ["read", "query"]
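To show what the three metrics above mean when queried by a dimension, here is a small sketch that computes them over in-memory rows. The function `compute_metrics` and the sample orders are hypothetical; a real semantic layer would generate SQL against the warehouse instead.

```python
from collections import defaultdict


def compute_metrics(orders, dimension):
    """Aggregate the two simple metrics per dimension value, then derive the third."""
    totals = defaultdict(lambda: {"total_revenue": 0.0, "order_count": 0})
    for order in orders:
        bucket = totals[order[dimension]]
        bucket["total_revenue"] += order["amount"]  # sum(amount)
        bucket["order_count"] += 1                  # count(*)
    for bucket in totals.values():
        # derived metric: total_revenue / order_count
        bucket["avg_order_value"] = bucket["total_revenue"] / bucket["order_count"]
    return dict(totals)


orders = [
    {"customer_segment": "retail", "amount": 40.0},
    {"customer_segment": "retail", "amount": 60.0},
    {"customer_segment": "enterprise", "amount": 500.0},
]
result = compute_metrics(orders, "customer_segment")
print(result["retail"])
# {'total_revenue': 100.0, 'order_count': 2, 'avg_order_value': 50.0}
```

The derived metric is computed after aggregation, not per row, so every consumer that asks for `avg_order_value` by segment gets the same ratio of sums.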
Monitoring Configuration
# .dbt_cloud/monitoring.yml
# Illustrative layout; not a file that dbt Cloud reads natively.
monitoring:
  enabled: true
  metrics:
    - name: "run_duration"
      type: "histogram"
      alert_threshold: "30m"
    - name: "cost_per_model"
      type: "gauge"
      alert_threshold: 10
  alerts:
    - name: "Long Running Job"
      condition: "run_duration > 30m"
      severity: "warning"
      channels:
        - "slack:#data-engineering"
        - "email:data-eng@company.com"
    - name: "High Cost Job"
      condition: "total_cost > 500"
      severity: "critical"
      channels:
        - "slack:#data-engineering"
        - "pagerduty:data-eng"
  dashboards:
    - name: "Job Performance"
      metrics: ["run_duration", "success_rate", "cost"]
      refresh: "1h"
    - name: "Cost Tracking"
      metrics: ["cost_per_model", "cost_per_job"]
      refresh: "1d"
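The alert conditions above, such as `run_duration > 30m`, can be evaluated with a sketch like the one below. The helpers `to_number` and `condition_fires` are hypothetical, durations are assumed to be tracked in seconds, and only the `>` operator is handled.

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_number(text):
    """Convert '30m' style durations to seconds; plain numbers pass through."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd]?)", text.strip())
    if not match:
        raise ValueError(f"unsupported value: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS.get(unit, 1)


def condition_fires(condition, observed):
    """Evaluate '<metric> > <threshold>' against a dict of observed values."""
    metric, operator, threshold = condition.split()
    if operator != ">":
        raise ValueError("only '>' is handled in this sketch")
    return observed[metric] > to_number(threshold)


observed = {"run_duration": 45 * 60, "total_cost": 120}  # seconds, currency units
print(condition_fires("run_duration > 30m", observed))  # True
print(condition_fires("total_cost > 500", observed))    # False
```

Normalizing units before comparing avoids the common mistake of checking a duration in seconds against a threshold written in minutes.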
Performance Metrics
| Feature | Description | Indicative impact (varies by project) |
|---|---|---|
| Slim CI | Selective execution | 80-90% faster |
| Caching | Result caching | 50-70% faster |
| Parallelism | Concurrent execution | 2-3x faster |
| Monitoring | Real-time insights | Proactive alerts |
| Semantic Layer | Metric consistency | Improved governance |
See Also
- Data Quality Tests: Test scheduling and CI/CD integration
- Mesh & Data Collaboration: Cross-project references and governance
- Performance Tuning: Optimization strategies for dbt runs
- dbt Best Practices: CI/CD and operations patterns