πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

dbt Best Practices

🟒 Free Lesson

Advertisement

dbt Best Practices

Best Practices Architecture

Naming Conventions

Project Structure

Architecture Diagram
+-----------------------------------------------------------------------------+
|                     PROJECT STRUCTURE BEST PRACTICES                        |
+-----------------------------------------------------------------------------+
|                                                                             |
|  +---------------------------------------------------------------------+    |
|  |                    RECOMMENDED STRUCTURE                             |   |
|  |                                                                     |    |
|  |  my_project/                                                       |     |
|  |  +-- dbt_project.yml                                               |     |
|  |  +-- packages.yml                                                  |     |
|  |  +-- profiles.yml                                                  |     |
|  |  |                                                                  |    |
|  |  +-- models/                                                       |     |
|  |  |   +-- staging/                                                  |     |
|  |  |   |   +-- _sources.yml        β—„-- Source definitions            |     |
|  |  |   |   +-- stg_customers.sql                                    |      |
|  |  |   |   +-- stg_orders.sql                                       |      |
|  |  |   |   +-- stg_products.sql                                     |      |
|  |  |   |                                                            |      |
|  |  |   +-- intermediate/                                             |     |
|  |  |   |   +-- int_orders_joined.sql                                |      |
|  |  |   |   +-- int_orders_aggregated.sql                            |      |
|  |  |   |   +-- int_orders_cleaned.sql                               |      |
|  |  |   |                                                            |      |
|  |  |   +-- marts/                                                   |      |
|  |  |       +-- finance/                                              |     |
|  |  |       |   +-- fct_orders.sql                                   |      |
|  |  |       |   +-- dim_customers.sql                                |      |
|  |  |       |   +-- fct_revenue.sql                                  |      |
|  |  |       |                                                        |      |
|  |  |       +-- marketing/                                            |     |
|  |  |       |   +-- fct_campaigns.sql                                |      |
|  |  |       |   +-- dim_campaigns.sql                                |      |
|  |  |       |                                                        |      |
|  |  |       +-- product/                                              |     |
|  |  |           +-- fct_events.sql                                   |      |
|  |  |           +-- dim_users.sql                                    |      |
|  |  |                                                                |      |
|  |  +-- seeds/                    β—„-- CSV seed data                  |      |
|  |  +-- snapshots/                β—„-- SCD snapshots                  |      |
|  |  +-- tests/                    β—„-- Custom data tests              |      |
|  |  +-- macros/                   β—„-- Reusable macros                |      |
|  |  +-- analysis/                 β—„-- Ad-hoc analysis                |      |
|  |  +-- docs/                     β—„-- Documentation                  |      |
|  +---------------------------------------------------------------------+    |
|                                                                             |
+-----------------------------------------------------------------------------+

Testing Strategy

Detailed Explanation


What are dbt Best Practices?

Best practices in dbt ensure maintainable, scalable, and reliable data transformations.


Code Style

  1. Consistent formatting β€” Use consistent indentation and spacing
  2. Descriptive naming β€” Clear, meaningful names for all objects
  3. Modular code β€” Small, focused models and macros
  4. Documentation β€” Comprehensive descriptions and examples

Project Structure

LayerPurposeExample
Staging1:1 with source tablesstg_orders
IntermediateBusiness logic joinsint_orders_joined
MartsFinal business datasetsfct_orders
  • Group by business domain (finance, marketing, product)
  • Apply patterns consistently across all layers
  • Maintain clear boundaries between layers

Testing Strategy

  1. Test everything β€” Models, sources, macros
  2. Automate tests β€” Run tests in CI/CD
  3. Monitor results β€” Track test pass/fail rates
  4. Alert on failures β€” Notify on test failures

Key Takeaway: Follow the layered architecture (staging β†’ intermediate β†’ marts) and test at every layer. This creates a reliable, maintainable data pipeline.

Code Examples

Model Documentation Template

# models/marts/fct_orders.yml
version: 2

models:
  - name: fct_orders
    description: >
      Fact table containing all order transactions. This is the central
      fact table for the order analytics domain.
      
      **Grain**: One row per order
      
      **Key Business Questions**:
      - What is our total revenue?
      - How many orders do we process?
      - What is the average order value?
      
      **Data Source**: Shopify via Fivetran
      **Refresh Frequency**: Hourly
      **Owner**: Data Engineering Team
    
    config:
      tags: ['finance', 'core', 'production']
      meta:
        owner: data-engineering
        team: analytics
        cost_center: finance
    
    columns:
      - name: order_id
        description: "Unique identifier for each order"
        data_tests:
          - unique
          - not_null
        meta:
          system: shopify
          pii: false
      
      - name: customer_id
        description: "Foreign key to dim_customers"
        data_tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_id
        meta:
          pii: false
      
      - name: order_date
        description: "Date when the order was placed"
        data_tests:
          - not_null
        meta:
          format: YYYY-MM-DD

Standardized Staging Model

-- models/staging/stg_orders.sql
{{
    config(
        materialized='view',
        schema='staging'
    )
}}

with source as (
    select * from {{ source('shopify', 'orders') }}
),

renamed as (
    select
        -- Primary keys
        id as order_id,
        customer_id,
        
        -- Dimensions
        status as order_status,
        financial_status,
        fulfillment_status,
        
        -- Measures
        {{ dbt_utils.money_snapshot(amount) }} as order_amount,
        {{ dbt_utils.money_snapshot(total_discounts) }} as discount_amount,
        {{ dbt_utils.money_snapshot(total_tax) }} as tax_amount,
        
        -- Timestamps
        created_at as order_created_at,
        updated_at as order_updated_at,
        
        -- Metadata
        _fivetran_synced as synced_at
    from source
)

select * from renamed

Standardized Fact Model

-- models/marts/fct_orders.sql
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        incremental_strategy='merge',
        partition_by={
            "field": "order_date",
            "data_type": "date"
        },
        cluster_by=['customer_id', 'order_status'],
        tags=['finance', 'core', 'production']
    )
}}

with orders as (
    select * from {{ ref('stg_orders') }}
),

customers as (
    select * from {{ ref('dim_customers') }}
),

order_items as (
    select * from {{ ref('stg_order_items') }}
),

final as (
    select
        -- Primary key
        orders.order_id,
        
        -- Foreign keys
        orders.customer_id,
        
        -- Dimensions
        customers.customer_name,
        customers.segment as customer_segment,
        orders.order_status,
        orders.order_date,
        
        -- Measures
        orders.order_amount,
        orders.discount_amount,
        orders.tax_amount,
        count(order_items.item_id) as item_count,
        
        -- Timestamps
        orders.order_created_at,
        orders.order_updated_at,
        current_timestamp() as dbt_updated_at
    from orders
    left join customers on orders.customer_id = customers.customer_id
    left join order_items on orders.order_id = order_items.order_id
    group by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
)

select * from final

{% if is_incremental() %}
where dbt_updated_at > (select max(dbt_updated_at) from {{ this }})
{% endif %}

Testing Template

# models/marts/fct_orders_tests.yml
version: 2

models:
  - name: fct_orders
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - order_id
      
      - dbt_utils.accepted_range:
          min_value: 0
          max_value: 1000000
          column_name: order_amount
    
    columns:
      - name: order_id
        data_tests:
          - unique
          - not_null
      
      - name: customer_id
        data_tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_id
      
      - name: order_amount
        data_tests:
          - not_null
          - dbt_utils.accepted_range:
              min_value: 0
              max_value: 1000000
      
      - name: order_date
        data_tests:
          - not_null
          - dbt_utils.expression_is_true:
              expression: "order_date <= current_date()"

Macro Template

-- macros/generate_generic_test.sql
{% macro generate_generic_test(test_name, model, column_name, config) %}
    {% set test_sql %}
        select
            '{{ test_name }}' as test_name,
            '{{ model }}' as model_name,
            '{{ column_name }}' as column_name,
            count(*) as failures
        from {{ model }}
        where {{ column_name }} is null
    {% endset %}
    
    {% set result = run_query(test_sql) %}
    
    {{ return(result) }}
{% endmacro %}

Performance Metrics

MetricDescriptionTarget
Code QualityLinting and style checks100% pass
Test CoveragePercentage of models tested>90%
Documentation CoveragePercentage documented>95%
Build Success RatePercentage of successful builds>99%
Average Build TimeTime to build all models&lt;30min

Best Practices

  1. Follow naming conventions - Consistent, descriptive names
  2. Use layered architecture - staging -> intermediate -> marts
  3. Test everything - Models, sources, macros
  4. Document comprehensively - Descriptions, lineage, examples
  5. Use version control - All code in Git
  6. Implement CI/CD - Automated testing and deployment
  7. Monitor performance - Track metrics continuously
  8. Review code regularly - Peer reviews and audits

See Also

⭐

Premium Content

dbt Best Practices

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert dbt Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement