Jinja Templating in dbt
Jinja Processing Architecture
Macro Definition Flow
Block Structure
+-----------------------------------------------------------------------------+
| JINJA BLOCK STRUCTURE |
+-----------------------------------------------------------------------------+
| |
| +---------------------------------------------------------------------+ |
| | {% block definition %} | |
| | {% set variables = [] %} | |
| | {% for item in items %} | |
| | {{ item }} | |
| | {% endfor %} | |
| | {% if condition %} | |
| | {{ output }} | |
| | {% endif %} | |
| | {% endblock %} | |
| +---------------------------------------------------------------------+ |
| |
| BLOCK TYPES: |
| * Variable Set: {% set x = value %} |
| * For Loop: {% for i in items %}...{% endfor %} |
| * If/Else: {% if condition %}...{% else %}...{% endif %} |
| * Macro: {% macro name() %}...{% endmacro %} |
| * Block: {% block name %}...{% endblock %} |
| * Call: {% call macro() %}...{% endcall %} |
| * Import: {% import 'file' as alias %} |
| * From: {% from 'file' import macro %} |
| |
+-----------------------------------------------------------------------------+
Detailed Explanation
Jinja2 is the templating engine that powers dbt's dynamic SQL generation capabilities. Understanding Jinja is essential for writing advanced dbt models, creating reusable macros, and building sophisticated data transformations.
What is Jinja Template Syntax?
Jinja templates contain special delimiters that distinguish them from regular SQL:
| Delimiter | Purpose | Example |
|---|---|---|
{{ variable }} | Output the value of a variable | {{ order_date }} |
{% statement %} | Execute Python-like code | {% if condition %}...{% endif %} |
{# comment #} | Jinja comments (stripped during rendering) | {# This is a comment #} |
What data types does Jinja support?
- Strings: Text values wrapped in quotes
- Numbers: Integers and floats
- Lists: Ordered collections using
[item1, item2] - Dictionaries: Key-value pairs using
{'key': 'value'} - Booleans:
TrueorFalse
Variables can be set using the {% set %} tag:
{% set my_variable = 'value' %}
{% set my_list = [1, 2, 3] %}
{% set my_dict = {'key': 'value'} %}
What control structures are available?
For Loops: Iterate over collections
{% for column in columns %}
{{ column }} as {{ column }}_renamed
{% if not loop.last %},{% endif %}
{% endfor %}
Conditionals: Branch logic based on conditions
{% if incremental %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
Macros: Reusable code blocks
{% macro empty_table(columns) %}
select
{% for col in columns %}
cast(null as {{ col.data_type }}) as {{ col.name }}
{% if not loop.last %},{% endif %}
{% endfor %}
{% endmacro %}
What filters are available?
Jinja filters transform values before output:
| Filter | Purpose | Example |
|---|---|---|
upper | Convert to uppercase | {{ value | upper }} |
lower | Convert to lowercase | {{ value | lower }} |
trim | Remove whitespace | {{ value | trim }} |
join | Join list with separator | {{ value | join(', ') }} |
default | Provide default value | {{ value | default('N/A') }} |
length | Get length of string/list | {{ value | length }} |
replace | String replacement | {{ value | replace('old', 'new') }} |
What context variables does dbt provide?
{{ ref('model_name') }}- Reference another model{{ source('source_name', 'table_name') }}- Reference a source{{ var('variable_name') }}- Access project variables{{ config(key='value') }}- Configure model settings{{ this }}- Reference the current model{{ target }}- Access target connection details{{ adapter }}- Access adapter-specific functions
Key Takeaway: Jinja templating enables dynamic SQL generation through variables, control structures, filters, and context variables, making dbt models flexible and reusable.
Code Examples
Dynamic Column Selection
-- models/staging/stg_orders.sql
{% set columns = ['id', 'customer_id', 'status', 'amount', 'created_at'] %}
with source as (
select * from {{ source('raw', 'orders') }}
),
renamed as (
select
{% for col in columns %}
{{ col }}
{% if not loop.last %},{% endif %}
{% endfor %}
from source
)
select * from renamed
Conditional Column Generation
-- models/marts/fct_orders.sql
{% set include_profit = var('include_profit', false) %}
with orders as (
select
order_id,
customer_id,
order_date,
amount,
cost,
{% if include_profit %}
amount - cost as profit,
safe_divide(amount - cost, amount) as profit_margin
{% endif %}
from {{ ref('stg_orders') }}
)
select * from orders
Reusable Macro for Date Spine
-- macros/date_spine.sql
{% macro date_spine(start_date, end_date, datepart="day") %}
{% set n = dbt_utils.datediff(start_date, end_date, datepart) %}
{{ dbt_utils.generate_series(n) }}
select
{{ dbt_utils.dateadd(datepart, "row_number() over (order by 1) - 1", start_date) }} as date_{{ datepart }}
from {{ ref('dbt_utils_generate_series') }}
where row_number() <= {{ n + 1 }}
{% endmacro %}
Advanced Macro with Dispatch
-- macros/cross_db_utils.sql
{% macro safe_cast(column, type) %}
{% set macro = adapter.dispatch('safe_cast', 'dbt_utils')(column, type) %}
{{ return(macro) }}
{% endmacro %}
{% macro default__safe_cast(column, type) %}
safe_cast({{ column }} as {{ type }})
{% endmacro %}
{% macro postgres__safe_cast(column, type) %}
cast({{ column }} as {{ type }})
{% endmacro %}
Dynamic Incremental Strategy
-- models/marts/fct_events.sql
{{
config(
materialized='incremental',
incremental_strategy=var('incremental_strategy', 'merge'),
unique_key='event_id',
partition_by={
"field": "event_date",
"data_type": "date",
"granularity": var('partition_granularity', 'day')
},
cluster_by=var('cluster_columns', ['user_id', 'event_type'])
)
}}
with events as (
select * from {{ ref('stg_events') }}
),
final as (
select
event_id,
user_id,
event_type,
event_timestamp,
cast(event_timestamp as date) as event_date,
event_properties,
current_timestamp() as dbt_updated_at
from events
)
select * from final
{% if is_incremental() %}
where dbt_updated_at > (select max(dbt_updated_at) from {{ this }})
{% endif %}
Performance Metrics
| Operation | Complexity | Use Case | Example |
|---|---|---|---|
| Variable Set | O(1) | Store values | {% set x = 1 %} |
| For Loop | O(n) | Iterate lists | {% for i in items %} |
| Conditional | O(1) | Branch logic | {% if condition %} |
| Macro Call | O(1) | Reusable code | {{ macro() }} |
| Filter | O(1) | Transform values | {{ x | upper }} |
| Include | O(1) | Import templates | {{ include 'file' }} |
Best Practices
- Use descriptive variable names for clarity and maintainability
- Keep macros small and focused on single responsibilities
- Use filters to transform values rather than inline logic
- Document macros with docstrings and comments
- Use
{% set %}for complex expressions to improve readability - Avoid deep nesting in control structures
- Use
default()filter to provide fallback values - Test macros with different input combinations
See Also
- The ref() Function β Model reference resolution and dependency management
- dbt Core Architecture β Compilation pipeline and manifest structure
- dbt Project Configuration β Project setup, packages, and profiles
- PySpark Basics β Distributed data processing with Spark
- Data Engineering Fundamentals β Modern data stack overview