πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

dbt Custom Macros

🟒 Free Lesson

Advertisement

dbt Custom Macros

Custom Macro Architecture

Macro Type Pipeline

Formal Definitions

DfMacro

A macro in dbt is a reusable block of Jinja and SQL code defined in .sql files under the macros/ directory. Macros accept arguments, process them through Jinja templating, and return SQL strings. They are invoked in models using {{ macro_name(arg1, arg2) }} syntax. Macros enable DRY (Don't Repeat Yourself) principles by encapsulating common transformation patterns.

DfMacro Namespace

The macro namespace is the resolution order for macro names. dbt searches for macros in this order: project macros > package macros > dispatched macros. This allows packages to provide default implementations that projects can override. The namespace ensures that custom macros take precedence over package-provided macros.

DfDispatch Macro

A dispatch macro enables adapter-specific implementations of a generic macro. The dispatch() function looks for a macro with a specific suffix (e.g., _bigquery, _snowflake) before falling back to the default implementation. This allows the same macro to behave differently across database platforms while maintaining a consistent interface.

DfOperation Macro

An operation macro is a macro designed to be called with dbt run-operation macro_name. Unlike SQL macros that return SQL, operation macros execute arbitrary logic: querying metadata, creating objects, sending notifications, or running DDL. They are invoked from the CLI, not from within models.

Detailed Explanation


What are dbt Custom Macros?

dbt macros are the primary mechanism for code reuse and abstraction. They range from simple SQL snippets to complex Jinja logic that generates platform-specific queries.


Macro Categories

CategoryPurposeInvocation
SQL MacrosReturn SQL fragments for use in models{{ macro() }} in models
Operation MacrosExecute side effects from the CLIdbt run-operation
Generic Test MacrosDefine custom data test logicYAML data_tests:
Materialization MacrosOverride default materialization behaviorconfig(materialized=...)
Dispatch MacrosProvide adapter-specific implementationsadapter.dispatch()

Key Takeaway: Use return() to output SQL from macros. Use run_query() for DDL/DML in operation macros. Always test macros with dbt run-operation before deployment.

Macros can access dbt's context variables: this, ref(), source(), var(), target, env_var(), and all adapter methods. This makes macros extremely powerful for generating dynamic SQL based on project configuration and environment.

When building custom macros, always use return() to output SQL. Without an explicit return, Jinja's default behavior may produce unexpected results. For operation macros, use run_query() to execute DDL/DML and log() for debugging output.

Code Examples

Basic SQL Macro

-- macros/generate_schema_name.sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}

Macro with Arguments

-- macros/cast_column.sql
{% macro cast_column(column_name, target_type) %}
    {%- if target.type == 'bigquery' %}
        cast({{ column_name }} as {{ target_type }})
    {%- elif target.type == 'snowflake' %}
        cast({{ column_name }} as {{ target_type }})
    {%- elif target.type == 'redshift' %}
        cast({{ column_name }} as {{ target_type }})
    {%- else %}
        cast({{ column_name }} as {{ target_type }})
    {%- endif %}
{% endmacro %}

-- Usage in model:
-- select {{ cast_column('amount', 'numeric(10,2)') }} as amount

Complex Transformation Macro

-- macros/generate_dimensions.sql
{% macro generate_dimensions(columns, prefix='dim') %}
    {%- for col in columns %}
        {{ prefix }}_{{ col }} as {{ col }}
        {%- if not loop.last %},{% endif %}
    {%- endfor %}
{% endmacro %}

-- macros/generate_measures.sql
{% macro generate_measures(measures, agg='sum') %}
    {%- for measure in measures %}
        {{ agg }}({{ measure }}) as {{ agg }}_{{ measure }}
        {%- if not loop.last %},{% endif %}
    {%- endfor %}
{% endmacro %}

-- Usage in model:
-- select
--     {{ generate_dimensions(['product_id', 'product_name', 'category']) }},
--     {{ generate_measures(['quantity', 'revenue']) }}
-- from stg_orders

Operation Macro

-- macros/refresh_all_snapshots.sql
{% macro refresh_all_snapshots() %}
    {%- set snapshot_models = graph.nodes.values() | selectattr('resource_type', 'equalto', 'snapshot') -%}
    
    {%- for snapshot in snapshot_models %}
        {%- set snapshot_sql %}
            snapshot {{ snapshot.unique_id }}
        {%- endset -%}
        
        {{ log("Refreshing snapshot: " ~ snapshot.name, info=True) }}
        {% do run_query(snapshot_sql) %}
    {%- endfor %}
    
    {{ log("All snapshots refreshed successfully", info=True) }}
{% endmacro %}

-- Run with: dbt run-operation refresh_all_snapshots

Custom Generic Test Macro

-- macros/test_accepted_range.sql
{% test accepted_range(model, column_name, min_value, max_value) %}
    select
        '{{ model }}' as model_name,
        '{{ column_name }}' as column_name,
        count(*) as failures
    from {{ model }}
    where {{ column_name }} < {{ min_value }}
       or {{ column_name }} > {{ max_value }}
{% endtest %}

-- Usage in YAML:
-- data_tests:
--   - accepted_range:
--       min_value: 0
--       max_value: 1000000

Dispatch Macro

-- macros/cross_db_utils/cast_as_bigint.sql
{% macro cast_as_bigint(column) %}
    {{ return(adapter.dispatch('cast_as_bigint')(column)) }}
{% endmacro %}

{% macro default__cast_as_bigint(column) %}
    cast({{ column }} as bigint)
{% endmacro %}

{% macro bigquery__cast_as_bigint(column) %}
    safe_cast({{ column }} as int64)
{% endmacro %}

{% macro snowflake__cast_as_bigint(column) %}
    cast({{ column }} as bigint)
{% endmacro %}

{% macro redshift__cast_as_bigint(column) %}
    cast({{ column }} as bigint)
{% endmacro %}

Macro for Data Migration

-- macros/backfill_column.sql
{% macro backfill_column(target_table, source_table, column_name, key_column) %}
    {%- set sql -%}
        update {{ target_table }} as t
        set {{ column_name }} = s.{{ column_name }}
        from {{ source_table }} as s
        where t.{{ key_column }} = s.{{ key_column }}
          and t.{{ column_name }} is null
    {%- endset -%}
    
    {{ log("Executing backfill for column: " ~ column_name, info=True) }}
    {% do run_query(sql) %}
    {{ log("Backfill complete for column: " ~ column_name, info=True) }}
{% endmacro %}

-- Run with: dbt run-operation backfill_column --args '{"target_table": "analytics.fct_orders", "source_table": "staging.stg_orders", "column_name": "customer_segment", "key_column": "order_id"}'

Macro for Documentation

-- macros/generate_column_description.sql
{% macro generate_column_description(column_name, description, data_type, nullable=true, primary_key=false) %}
    {%- set parts = [] -%}
    {%- do parts.append(description) -%}
    {%- if primary_key -%}
        {%- do parts.append("Primary key") -%}
    {%- endif -%}
    {%- if not nullable -%}
        {%- do parts.append("NOT NULL") -%}
    {%- endif -%}
    {%- do parts.append("Type: " ~ data_type) -%}
    
    {{ parts | join(' | ') }}
{% endmacro %}

Macro Reference Table

Macro TypeReturn TypeInvocationUse Case
SQLSQL string{{ macro() }} in modelsReusable SQL fragments
OperationNone (side effects)dbt run-operationCLI automation
Generic TestSQL (test query)YAML data_tests:Custom test logic
MaterializationSQL/DMLconfig(materialized=...)Custom materializations
DispatchSQL stringadapter.dispatch()Cross-database compatibility

Best Practices

  1. Naming conventions - Use snake_case for macro names
  2. Documentation - Add docstrings at the top of each macro
  3. Argument validation - Check required arguments before processing
  4. Error handling - Use exceptions.warn() for non-fatal errors
  5. Testing - Test macros with dbt run-operation before deployment
  6. Versioning - Include macros in package versioning
  7. Cross-database - Use dispatch for database-specific logic
  8. Performance - Avoid unnecessary Jinja loops in large datasets

See Also

⭐

Premium Content

dbt Custom Macros

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert dbt Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement