Jinja is dbt’s templating engine, originally designed for generating HTML in Python web frameworks like Flask and Django. The same mechanism that generates <ul> tags from a list generates SUM(CASE WHEN ...) columns from a list of payment methods. The syntax is also standard in Ansible playbooks.
Jinja treats templating as a distinct concern from programming. The model is SQL with variables, not a program that emits SQL strings. Analytics engineers who came up through SQL generally find this easier to adopt than JavaScript-based alternatives.
The Two-Delimiter Model
Jinja uses two delimiter types that cover everything you need:
{{ expression }} — evaluates and outputs a value. This is where you put references, variables, and function calls:
```sql
SELECT
    customer_id,
    email
FROM {{ ref('base__stripe__customers') }}
WHERE status = {{ var('active_status') }}
```

{% statement %} — controls logic without producing output. Conditionals, loops, and variable assignments live here:
```sql
{% if target.name == 'prod' %}
    AND created_at > CURRENT_DATE - 90
{% endif %}
```
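Because the conditional is resolved at compile time, the warehouse sees different SQL per environment — a common pattern for keeping development builds small. Against a prod target the filter survives; against any other target the block compiles away entirely:

```sql
-- compiled with target.name == 'prod'
AND created_at > CURRENT_DATE - 90

-- compiled with any other target: the block renders to nothing
```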
```sql
{% set payment_methods = ['bank_transfer', 'credit_card', 'gift_card'] %}

{% for method in payment_methods %}
    SUM(CASE WHEN payment_method = '{{ method }}' THEN amount END)
        AS {{ method }}_amount{{ ',' if not loop.last }}
{% endfor %}
```

That is the entire mental model. Two delimiter types. One for output, one for logic. You can learn enough Jinja for dbt in an afternoon.
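Since the loop runs at compile time, the warehouse never sees any Jinja — only the expanded columns (whitespace tidied here for readability):

```sql
SUM(CASE WHEN payment_method = 'bank_transfer' THEN amount END) AS bank_transfer_amount,
SUM(CASE WHEN payment_method = 'credit_card' THEN amount END) AS credit_card_amount,
SUM(CASE WHEN payment_method = 'gift_card' THEN amount END) AS gift_card_amount
```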
Compare this to Dataform’s JavaScript approach, where the same loop looks like:
```javascript
${["bank_transfer", "credit_card", "gift_card"]
    .map(method => `SUM(CASE WHEN payment_method = '${method}' THEN amount END) AS ${method}_amount`)
    .join(", ")}
```

Both produce identical SQL. The Jinja version requires knowing what {% for %} and loop.last do. The JavaScript version requires knowing array .map(), arrow functions, template literals, and .join(). For someone who writes SQL daily and JavaScript occasionally, the Jinja version is easier to read a month later.
Macros as SQL Helpers
The analogy that clicks for SQL practitioners: Jinja macros are like SQL functions, except they run at compile time and return SQL text instead of query results.
A SQL function:
```sql
CREATE FUNCTION cents_to_dollars(cents INT64)
RETURNS FLOAT64 AS (
    ROUND(cents / 100.0, 2)
);
```

The equivalent Jinja macro:
```sql
{% macro cents_to_dollars(column_name) %}
    ROUND({{ column_name }} / 100.0, 2)
{% endmacro %}
```

Called in a model:
```sql
SELECT
    order_id,
    {{ cents_to_dollars('amount_cents') }} AS order__amount_dollars
FROM {{ ref('base__stripe__orders') }}
```

The macro renders at compile time and the warehouse sees:
```sql
SELECT
    order_id,
    ROUND(amount_cents / 100.0, 2) AS order__amount_dollars
FROM analytics.base__stripe__orders
```

This pattern keeps macros feeling like SQL helpers. You are not writing a program that generates SQL — you are parameterizing a snippet of SQL. The intent is transparent; the behavior is predictable.
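The compile-time expansion can be reproduced outside dbt with the plain jinja2 Python package that dbt builds on — a minimal sketch to make "renders at compile time" concrete (dbt layers ref(), config(), and project context on top; none of that is needed here):

```python
from jinja2 import Template

# The macro definition and the model body, as one template string.
# The {%- ... -%} markers trim surrounding whitespace so the rendered
# SQL comes out on a single clean line.
template = Template(
    "{% macro cents_to_dollars(column_name) -%}"
    "ROUND({{ column_name }} / 100.0, 2)"
    "{%- endmacro %}"
    "SELECT {{ cents_to_dollars('amount_cents') }} AS amount_dollars"
)

# Rendering happens before any database is involved: the macro call
# is replaced by SQL text, and only that text reaches the warehouse.
compiled_sql = template.render()
print(compiled_sql)
# → SELECT ROUND(amount_cents / 100.0, 2) AS amount_dollars
```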
A slightly more involved example — a macro for extracting GA4 event parameters, which requires the same UNNEST pattern on every model:
```sql
{% macro unnest_event_param(param_name, value_type='string_value') %}
    (SELECT value.{{ value_type }}
     FROM UNNEST(event_params)
     WHERE key = '{{ param_name }}')
{% endmacro %}
```

In a model:
```sql
SELECT
    event_id,
    {{ unnest_event_param('page_location') }} AS page_location,
    {{ unnest_event_param('session_id') }} AS session_id,
    {{ unnest_event_param('ga_session_number', 'int_value') }} AS session_number
FROM {{ ref('base__ga4__events') }}
```

Any SQL practitioner can read this and understand it. The macro hides the boilerplate UNNEST syntax without hiding the intent.
The equivalent in Dataform is a JavaScript function in an includes/ file:
```javascript
function unnestEventParam(paramName, valueType = 'string_value') {
    return `(SELECT value.${valueType}
             FROM UNNEST(event_params)
             WHERE key = '${paramName}')`;
}

module.exports = { unnestEventParam };
```

Both approaches work. Neither has a material advantage for simple helpers like this. The Jinja version integrates directly with dbt’s macro system and is callable from any model without import statements. The JavaScript version requires module.exports and explicit imports in consuming files.
Separation of Concerns via YAML
One aspect of dbt’s Jinja approach that analytics engineers often appreciate: the SQL file focuses on the transformation, not the metadata.
A dbt model:
```sql
-- models/marts/mrt__stripe__customers.sql

{{ config(
    materialized='table',
    schema='marts'
) }}

SELECT
    customer_id,
    email,
    {{ cents_to_dollars('lifetime_value_cents') }} AS lifetime_value,
    created_at
FROM {{ ref('int__stripe__customers_enriched') }}
```

The tests, column descriptions, and data quality assertions live separately in a YAML file:
```yaml
models:
  - name: mrt__stripe__customers
    description: "One row per customer with current billing status and lifetime value."
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - not_null
```

Compare to Dataform’s SQLX, where everything coexists in a single file:
```sqlx
config {
    type: "table",
    schema: "marts",
    assertions: {
        uniqueKey: ["customer_id"],
        nonNull: ["customer_id", "email"]
    }
}

SELECT
    customer_id,
    email,
    ROUND(lifetime_value_cents / 100.0, 2) AS lifetime_value,
    created_at
FROM ${ref("int__stripe__customers_enriched")}
```

Neither layout is objectively better. Dataform’s all-in-one approach appeals to engineers who prefer everything visible in one place. dbt’s split appeals to teams who want transformation logic in SQL files and data quality logic in YAML files — two different reviewers, two different concerns, two different files.
Practically, dbt’s YAML-based tests become a significant advantage as the testing ecosystem grows. Adding dbt-expectations or elementary tests to the YAML requires no changes to the SQL file. See Dataform Testing Limitations for the full comparison of what each ecosystem provides.
The Mature Package Ecosystem
One practical benefit that follows from Jinja’s adoption across the dbt community: a large library of pre-built macros that work in any dbt project.
dbt-utils is the canonical example. It provides:
- generate_surrogate_key — hashed composite keys with consistent null handling
- star — column-list generation with exclusions
- get_column_values — dynamic SQL based on actual data
- date_spine — continuous date sequences for time-series gaps
- union_relations — combining tables with different schemas
None of these require writing custom Jinja. They are installable in packages.yml and immediately available as {{ dbt_utils.generate_surrogate_key(['customer_id', 'order_date']) }}.
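A minimal packages.yml declaring the dependency (the version range here is illustrative — check dbt’s package hub for current releases), after which `dbt deps` downloads the package:

```yaml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
```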
The equivalent in Dataform is building each utility from scratch as a JavaScript function. The community package ecosystem for Dataform is orders of magnitude smaller. For teams that expect to use standardized analytics engineering patterns — and most do — this asymmetry matters.
When the Constraint Becomes Visible
Jinja’s SQL-practitioner-friendliness is real, but so is its ceiling. The constraint appears when you need to generate models programmatically.
Jinja runs inside a single .sql file and generates a single SQL statement. It cannot create new files, new models, or new DAG nodes. If you need 50 country-specific versions of the same model, Jinja’s answer is: write 50 files, or find a workaround. JavaScript’s answer is: write a loop.
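What Jinja can do is collapse the variants into a single model when one unioned output is acceptable — a loop over countries inside one file, producing one DAG node rather than fifty. A sketch, where the country list and the base model naming scheme are illustrative:

```sql
{% set country_codes = ['us', 'gb', 'de'] %}  -- illustrative list

{% for country in country_codes %}
SELECT
    '{{ country }}' AS country_code,
    order_id,
    amount
FROM {{ ref('base__orders_' ~ country) }}
{% if not loop.last %}UNION ALL{% endif %}
{% endfor %}
```

This stays within the ceiling: one file, one compiled statement, one node. Separate per-country models with independent schedules or materializations still require separate files.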
For most analytics engineering projects, this ceiling is never hit. Standard dimensional modeling — base models, intermediate joins, mart aggregations — does not require dynamic model generation. Jinja handles it comfortably. The programmatic model generation pattern is genuinely powerful, but most teams do not need it.
If a project needs dynamic model generation, the constraint is real. If it does not, Jinja’s approachability for SQL practitioners is a practical advantage.