dbt Project Configuration
Project Architecture
Dependency Management
Architecture Diagram
+----------------------------------------------------------------+
|                   DEPENDENCY RESOLUTION FLOW                   |
+----------------------------------------------------------------+
|                                                                |
|  +----------------------------------------------------------+  |
|  | PACKAGES.YML                                             |  |
|  |                                                          |  |
|  | packages:                                                |  |
|  |   - package: dbt-labs/dbt_utils                          |  |
|  |     version: [">=1.0.0", "<2.0.0"]                       |  |
|  |   - package: calogica/dbt_expectations                   |  |
|  |     version: [">=0.10.0"]                                |  |
|  |   - git: "https://github.com/org/custom_package.git"     |  |
|  |     revision: main                                       |  |
|  |   - package: dbt-labs/metrics                            |  |
|  |     version: [">=0.3.0"]                                 |  |
|  +----------------------------------------------------------+  |
|                               |                                |
|                               v                                |
|  +----------------------------------------------------------+  |
|  | DBT DEPS RESOLUTION                                      |  |
|  |                                                          |  |
|  | 1. Parse packages.yml                                    |  |
|  | 2. Resolve version constraints                           |  |
|  | 3. Download packages                                     |  |
|  | 4. Install to dbt_packages/                              |  |
|  | 5. Build dependency graph                                |  |
|  +----------------------------------------------------------+  |
|                               |                                |
|                               v                                |
|  +----------------------------------------------------------+  |
|  | INSTALLED PACKAGES                                       |  |
|  |                                                          |  |
|  | dbt_packages/                                            |  |
|  | +-- dbt_utils/                                           |  |
|  | |   +-- macros/                                          |  |
|  | |   +-- dbt_project.yml                                  |  |
|  | +-- dbt_expectations/                                    |  |
|  | |   +-- macros/                                          |  |
|  | |   +-- dbt_project.yml                                  |  |
|  | +-- custom_package/                                      |  |
|  |     +-- macros/                                          |  |
|  |     +-- dbt_project.yml                                  |  |
|  +----------------------------------------------------------+  |
|                                                                |
+----------------------------------------------------------------+
Environment Configuration
Architecture Diagram
+----------------------------------------------------------------+
|                     ENVIRONMENT MANAGEMENT                     |
+----------------------------------------------------------------+
|                                                                |
|  +----------------------------------------------------------+  |
|  | PROFILE CONFIGURATION                                    |  |
|  |                                                          |  |
|  | my_profile:                                              |  |
|  |   target: dev                                            |  |
|  |   outputs:                                               |  |
|  |     dev:                                                 |  |
|  |       type: snowflake                                    |  |
|  |       account: my_account                                |  |
|  |       user: my_user                                      |  |
|  |       password: "{{ env_var('DBT_PASSWORD') }}"          |  |
|  |       warehouse: compute_wh                              |  |
|  |       database: analytics_dev                            |  |
|  |       schema: dbt_{{ env_var('DBT_USER') }}              |  |
|  |                                                          |  |
|  |     prod:                                                |  |
|  |       type: snowflake                                    |  |
|  |       account: my_account                                |  |
|  |       user: service_account                              |  |
|  |       password: "{{ env_var('DBT_PROD_PASSWORD') }}"     |  |
|  |       warehouse: analytics_wh                            |  |
|  |       database: analytics_prod                           |  |
|  |       schema: public                                     |  |
|  +----------------------------------------------------------+  |
|                               |                                |
|                               v                                |
|  +----------------------------------------------------------+  |
|  | ENVIRONMENT VARIABLES                                    |  |
|  |                                                          |  |
|  | +-----------------+--------------------------------+     |  |
|  | | Variable        | Purpose                        |     |  |
|  | +-----------------+--------------------------------+     |  |
|  | | DBT_USER        | Current user for dev schema    |     |  |
|  | | DBT_PASSWORD    | Database password              |     |  |
|  | | DBT_ENV         | Current environment (dev/prod) |     |  |
|  | | DBT_HOST        | Database host                  |     |  |
|  | | DBT_WAREHOUSE   | Compute warehouse              |     |  |
|  | +-----------------+--------------------------------+     |  |
|  +----------------------------------------------------------+  |
|                                                                |
+----------------------------------------------------------------+
Detailed Explanation
dbt projects are the fundamental organizational unit in dbt. They contain all your models, tests, macros, and configurations.
What is Project Configuration?
The dbt_project.yml file is the central configuration file for your dbt project:
- Project metadata: Name, version, profile
- Resource paths: Where dbt looks for models, seeds, tests, macros, and snapshots
- Model configurations: Default materializations, schemas, tags
- Variables: Custom variables for dynamic configuration
- Clean targets: Directories that dbt clean removes
What is Package Management?
Packages are reusable collections of macros, models, and tests:
| Package | Purpose |
|---|---|
| dbt-labs/dbt_utils | General-purpose macros and generic tests |
| calogica/dbt_expectations | Data quality tests modeled on Great Expectations |
| dbt-labs/codegen | Macros that generate boilerplate YAML and SQL |
| Custom packages | Organization-specific code |
What is Profile Configuration?
Profiles define how dbt connects to your data warehouse:
- Target environments: Dev, staging, production
- Connection details: Account, user, password
- Warehouse settings: Warehouse name, role, threads, timeouts
- Schema configuration: Dynamic schemas per user
What are Environment Variables?
Environment variables allow dynamic configuration:
- Secrets: Passwords, tokens (never commit to Git)
- Environment-specific: Different settings per target
- User-specific: Per-developer configurations
- CI/CD: Pipeline-specific settings
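As a rough illustration of the categories above, the fragment below sketches how a profile might read its settings from the environment. It assumes the two-argument form of env_var, which supplies a fallback when the variable is unset, and the DBT_ENV_SECRET_ prefix, which dbt treats as a secret and keeps out of its logs. The variable names and default values are illustrative, and only the dev output is shown.

```yaml
# profiles.yml (fragment; variable names and defaults are assumptions)
analytics:
  # Environment-specific: fall back to "dev" when DBT_ENV is not set
  target: "{{ env_var('DBT_ENV', 'dev') }}"
  outputs:
    dev:
      type: snowflake
      # No default given, so dbt raises an error if the variable is missing
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      # Secret: the DBT_ENV_SECRET_ prefix keeps the value out of dbt logs
      password: "{{ env_var('DBT_ENV_SECRET_SNOWFLAKE_PASSWORD') }}"
      # CI/CD: a pipeline can select another warehouse without editing the file
      warehouse: "{{ env_var('DBT_WAREHOUSE', 'COMPUTE_WH') }}"
      database: ANALYTICS_DEV
      # User-specific: one schema per developer
      schema: "dbt_{{ env_var('DBT_USER') }}"
```

Omitting the default for connection details is usually the safer choice: a missing variable then stops the run instead of silently connecting somewhere unintended.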
What are the Project Best Practices?
- Use version control for all configuration files
- Never commit secrets - use environment variables
- Separate environments - dev, staging, production
- Document configurations - add comments and descriptions
- Test configurations - validate before deployment
- Use packages - leverage community code
- Version constraints - specify version ranges for packages
- Clean regularly - run dbt clean to remove build artifacts and installed packages
Key Takeaway: Proper project configuration ensures maintainability, security, and scalability of your dbt data transformation pipeline.
Code Examples
dbt_project.yml
# dbt_project.yml
name: 'my_analytics_project'
version: '1.0.0'
config-version: 2
# Must match a profile name defined in profiles.yml
profile: 'analytics'
model-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
docs-paths: ["docs"]
# Directories removed by dbt clean
clean-targets:
  - "target"
  - "dbt_packages"
  - "dbt_modules"  # legacy package directory used before dbt 1.0
models:
  my_analytics_project:
    staging:
      +materialized: view
      +schema: staging
      +tags: ['staging']
    intermediate:
      +materialized: ephemeral
      +tags: ['intermediate']
    marts:
      +materialized: incremental
      +schema: analytics
      +tags: ['mart', 'production']
      finance:
        +cluster_by: ['date', 'account_id']
      marketing:
        # partition_by is adapter-specific (for example BigQuery); dbt-snowflake does not use it
        +partition_by:
          field: event_date
          data_type: date
vars:
  start_date: '2020-01-01'
  enable_audit: true
  default_currency: 'USD'
query-comment:
  comment: "dbt: {{ node.unique_id }} | {{ node.description }}"
  append: true
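Configuration values in dbt_project.yml can also be rendered with Jinja. The sketch below assumes the target variable is available in that context and that the targets are named dev and prod, as in the profile shown in this document. The audit folder is hypothetical.

```yaml
# dbt_project.yml (fragment; the "audit" folder is hypothetical)
models:
  my_analytics_project:
    staging:
      # Views while developing, tables when running against prod
      +materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"
    audit:
      # Build audit models only in prod; as_bool converts the rendered text to a boolean
      +enabled: "{{ (target.name == 'prod') | as_bool }}"
```

Keeping the condition in the project file means a single codebase can serve every environment, with the active target deciding how models are built.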
packages.yml
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<1.0.0"]
  - package: dbt-labs/codegen
    version: [">=0.12.0"]
  - package: elementary-data/elementary
    version: [">=0.14.0"]
  - git: "https://github.com/my-org/custom_dbt_package.git"
    revision: main  # a branch moves over time; a tag or commit is more reproducible
  - package: dbt-labs/metrics
    version: [">=0.3.0"]
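Hub packages are not the only source dbt deps can install from. As a sketch, the fragment below pins a Git package to a tag rather than a moving branch, installs a package from a subdirectory of a repository, and references a package on the local filesystem. The git, revision, subdirectory, and local keys are part of packages.yml; the tag names, the second repository, and the paths are hypothetical.

```yaml
# packages.yml (fragment; tags, second repository, and paths are hypothetical)
packages:
  # Pin to a tag so every install resolves to the same code
  - git: "https://github.com/my-org/custom_dbt_package.git"
    revision: v1.2.0
  # A package stored in a subdirectory of a larger repository
  - git: "https://github.com/my-org/data-platform.git"
    revision: v0.4.0
    subdirectory: "dbt/shared_macros"
  # A package on the local filesystem, relative to the project root
  - local: ../shared_dbt_macros
```

After editing packages.yml, running dbt deps installs the listed packages into dbt_packages/.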
profiles.yml
# profiles.yml
# The top-level key must match the profile set in dbt_project.yml
analytics:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: TRANSFORMER
      database: ANALYTICS_DEV
      warehouse: COMPUTE_WH
      schema: "dbt_{{ env_var('DBT_USER') }}"
      client_session_keep_alive: false
      query_tag: "dbt_dev"
    staging:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: TRANSFORMER
      database: ANALYTICS_STG
      warehouse: ANALYTICS_WH
      schema: public
      client_session_keep_alive: true
      query_tag: "dbt_staging"
    prod:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: TRANSFORMER
      database: ANALYTICS_PROD
      warehouse: ANALYTICS_WH
      schema: public
      client_session_keep_alive: true
      query_tag: "dbt_production"
Model Configuration
# models/marts/fct_orders.yml
version: 2
models:
  - name: fct_orders
    description: "Fact table for orders"
    config:
      materialized: incremental
      unique_key: order_id
      incremental_strategy: merge
      # partition_by is adapter-specific (for example BigQuery); dbt-snowflake does not use it
      partition_by:
        field: order_date
        data_type: date
      cluster_by: ['customer_id', 'status']
      tags: ['finance', 'core', 'production']
      meta:
        owner: data-engineering
        team: analytics
        cost_center: finance
        pii: false
    columns:
      - name: order_id
        description: "Unique order identifier"
        data_tests:
          - unique
          - not_null
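The same properties file can attach tests that take arguments to other columns. This sketch reuses the customer_id and status columns named in cluster_by above and assumes a dim_customers model and a particular set of status values, neither of which appears in the original project. Newer dbt releases may expect test arguments nested under an arguments key, so the exact shape may need adjusting for your version.

```yaml
# models/marts/fct_orders.yml (additional columns; referenced model and values are assumptions)
version: 2
models:
  - name: fct_orders
    columns:
      - name: customer_id
        data_tests:
          - not_null
          # Every order should point at an existing customer
          - relationships:
              to: ref('dim_customers')   # hypothetical dimension model
              field: customer_id
      - name: status
        data_tests:
          # Fail when an unexpected status value appears
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
```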
Performance Metrics
| Component | Description | Impact |
|---|---|---|
| dbt_project.yml | Project configuration | High |
| packages.yml | Package dependencies | Medium |
| profiles.yml | Connection settings | High |
| Variables | Dynamic configuration | Low |
| Model configs | Model settings | High |
| Tags | Organization | Low |
See Also
- dbt Core Architecture: Manifest, DAG, and compilation pipeline
- The ref() Function: Model reference resolution and dependency management
- Jinja Templating in dbt: Template syntax, macros, and dynamic SQL
- Snowflake Architecture: Snowflake cloud data platform fundamentals
- Data Engineering Fundamentals: Modern data stack overview