You’ve hit the point where copy-pasting SQL between models feels wrong. The same date truncation logic appears in five places. The same null-handling pattern keeps showing up. Your instinct says “make a macro,” and that instinct is worth questioning.
Most dbt projects have too many macros, not too few. The drive to eliminate duplication creates abstractions that obscure what the SQL actually does. A model that requires reading four macro files just to understand a single query is unreadable. Good macro design creates code that your future self (and your teammates) can maintain, not just code with fewer repeated lines. If you’re looking for macros to adopt rather than build, see essential macros every dbt project needs. For the fundamentals of Jinja in dbt, see the dbt macros guide.
The rule of three
The first time you write a pattern, just write it. Inline. No macro.
The second time, note the pattern. Maybe add a comment: “Similar logic in base__shopify__orders.” Still no macro.
The third time, you have enough information to extract something useful. You’ve seen how the pattern actually gets used. You know which parts vary and which stay constant. Now you can build an abstraction that fits reality instead of guessing at what you might need.
This rule comes from software engineering practice, but dbt’s own documentation endorses it: “Favor readability when mixing Jinja with SQL, even if it means repeating some lines.”
In practice, the progression looks like this:
```sql
-- First time: inline
SELECT
    order_id,
    ROUND(amount_cents / 100.0, 2) AS order__amount_dollars
FROM {{ ref('base__shopify__orders') }}

-- Second time: still inline, but you notice the pattern
SELECT
    payment_id,
    ROUND(amount_cents / 100.0, 2) AS payment__amount_dollars
FROM {{ ref('base__stripe__payments') }}

-- Third time: now extract the macro
{% macro cents_to_dollars(column_name, scale=2) %}
    ROUND({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}
```
The temptation is to skip ahead. You see the pattern on day one and think, “I’ll definitely use this again.” Sometimes you’re right. More often, the second and third uses have subtle differences that break your original abstraction. Waiting lets you build the right abstraction instead of the imagined one.
The cost of over-abstraction
Premature macros create three problems:
Reduced readability. Every macro reference is a mental context switch. The reader has to open another file, understand the macro’s logic, then return to the model. With inline code, everything is visible in one place.
Increased complexity. To handle all the cases you imagined, you add parameters. Each parameter is a decision point. A macro with seven parameters is harder to use correctly than inline code would have been.
Fragile code. When a macro serves five models, changes get scary. You want to fix the behavior for one model but can’t risk breaking the others. So you add another parameter, making the complexity problem worse.
The rule of three is about gathering enough information to build the right abstraction, not just counting occurrences.
Single responsibility macros
Once you’ve decided a macro is worth creating, keep it focused on one job.
This macro tries to do too much:
```sql
{% macro process_amount(column, apply_discount=false, discount_rate=0.1, convert_currency=false, target_currency='USD') %}
    {% set result = column %}
    {% if apply_discount %}
        {% set result = result ~ ' * (1 - ' ~ discount_rate ~ ')' %}
    {% endif %}
    {% if convert_currency %}
        {% set result = result ~ ' * get_exchange_rate(\'' ~ target_currency ~ '\')' %}
    {% endif %}
    {{ result }}
{% endmacro %}
```
This macro has five parameters and combines three different operations. Using it requires understanding all the flags, testing it means covering every combination, and changing the discount logic risks breaking currency conversion.
Break it into focused macros instead:
```sql
{% macro cents_to_dollars(column_name, scale=2) %}
    ROUND({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}

{% macro apply_discount(amount_column, discount_rate) %}
    {{ amount_column }} * (1 - {{ discount_rate }})
{% endmacro %}

{% macro convert_currency(amount_column, target_currency) %}
    {{ amount_column }} * {{ get_exchange_rate(target_currency) }}
{% endmacro %}
```
Each macro does one thing, and the name explains what. Composing them in a model is explicit:
```sql
SELECT
    order_id,
    {{ convert_currency(
        apply_discount(cents_to_dollars('amount_cents'), 0.1),
        'EUR'
    ) }} AS order__discounted_amount_eur
FROM {{ ref('base__shopify__orders') }}
```
Yes, the model code is longer than a single `process_amount` call would be. But anyone reading it can see exactly what’s happening. When finance asks “how do we calculate discounted amounts in EUR?”, the answer is right there in the SQL.
Warning sign: parameter explosion. When a macro grows past five or six parameters, it’s probably doing too much. Look for ways to split it. If splitting doesn’t work cleanly, the abstraction might not be the right one.
Naming conventions that scale
A macro’s name is its documentation at the call site. Make it count.
Use verb prefixes for actions:
- `get_` for macros that retrieve data: `get_column_values`, `get_latest_partition`
- `generate_` for macros that create SQL: `generate_surrogate_key`, `generate_schema_name`
- `format_` for macros that transform output: `format_timestamp`, `format_currency`
Use descriptive names for transformations:
- `cents_to_dollars` over `convert_amount`
- `extract_domain_from_email` over `parse_email`
- `calculate_days_since` over `date_diff_helper`
The name should tell you what the macro does without opening the file.
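For example, a minimal sketch of what `extract_domain_from_email` might look like (assuming a warehouse that supports `SPLIT_PART`, such as Snowflake, Redshift, or Postgres):
```sql
{% macro extract_domain_from_email(email_column) %}
    -- Everything after the '@' is the domain; SPLIT_PART syntax is warehouse-specific
    SPLIT_PART({{ email_column }}, '@', 2)
{% endmacro %}
```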
Use underscore prefix for internal helpers:
```sql
{% macro _build_join_condition(columns) %}
    {# Internal helper, not for direct use in models #}
{% endmacro %}

{% macro generate_merge_statement(source, target, columns) %}
    {# Uses _build_join_condition internally #}
{% endmacro %}
```
The underscore signals “don’t call this directly.” Your team can use `generate_merge_statement` without understanding the internals.
One macro per file, filename matches macro name. When you need `cents_to_dollars`, you know it lives in `cents_to_dollars.sql`. No hunting. No scrolling through a 500-line `utils.sql` file. This rule comes from the Brooklyn Data style guide, and it’s one of the most practical conventions you can adopt.
Folder structure and organization
Macros need a logical home. This structure scales well:
```
macros/
├── _macros.yml                  # Documentation for all macros
├── generate_schema_name.sql     # dbt override macros at root
├── generate_alias_name.sql
├── utils/
│   ├── cents_to_dollars.sql
│   ├── limit_data_in_dev.sql
│   └── add_audit_columns.sql
├── transformations/
│   ├── finance/
│   │   ├── calculate_mrr.sql
│   │   └── prorate_amount.sql
│   └── marketing/
│       ├── attribution_weight.sql
│       └── channel_grouping.sql
└── tests/
    └── test_row_count_match.sql
```
Override macros live at the root. dbt’s `generate_schema_name`, `generate_alias_name`, and `generate_database_name` are special. Keeping them at the root of `macros/` makes them easy to find and signals their importance.
Utility macros get their own folder. Generic helpers like `add_audit_columns` (sketched below) or `limit_data_in_dev` belong in `utils/`. These are the macros any model might use.
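As a rough sketch of such a helper, `add_audit_columns` might be as small as this (the column names are illustrative; `invocation_id` is a built-in dbt context variable):
```sql
{% macro add_audit_columns() %}
    -- Illustrative audit columns; rename to match your conventions
    CURRENT_TIMESTAMP AS dbt_loaded_at,
    '{{ invocation_id }}' AS dbt_invocation_id
{% endmacro %}
```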
Domain-specific macros group by business area. Finance has its own logic. Marketing has its own logic. Grouping by domain makes it easier to find relevant macros and to assign ownership.
Custom generic tests go in tests/. When you write a test macro that can apply to any column or model (like `test_row_count_match`, sketched below), put it with the other tests.
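A sketch of what `test_row_count_match` could look like as a generic test (generic tests return rows on failure; the two-argument signature here is an assumption):
```sql
{% test row_count_match(model, compare_model) %}
-- Fails when the two relations disagree on row count:
-- dbt treats any returned rows as test failures
WITH counts AS (
    SELECT
        (SELECT COUNT(*) FROM {{ model }}) AS model_count,
        (SELECT COUNT(*) FROM {{ compare_model }}) AS compare_count
)
SELECT *
FROM counts
WHERE model_count != compare_count
{% endtest %}
```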
The exact structure matters less than consistency. Pick a pattern, document it, and stick to it. For the broader project layout, see my guide to dbt project structure.
Documenting macros
Documentation in _macros.yml beats comments in SQL files. YAML documentation shows up in dbt docs, supports structured argument definitions, and creates a single source of truth.
````yaml
version: 2

macros:
  - name: cents_to_dollars
    description: |
      Converts an integer cents column to a decimal dollars value.

      ## Usage
      ```sql
      {{ cents_to_dollars('amount_cents') }}
      {{ cents_to_dollars('amount_cents', scale=4) }}
      ```

      ## Notes
      - Assumes input is already an integer (no validation)
      - Uses ROUND for consistent decimal places
    arguments:
      - name: column_name
        type: string
        description: The column containing cents values
      - name: scale
        type: integer
        description: Number of decimal places. Defaults to 2.

  - name: limit_data_in_dev
    description: |
      Adds a date filter in development environments to limit data volume.
      Returns empty string in production.

      ## Usage
      Add to WHERE clause with AND:
      ```sql
      WHERE 1=1
      {{ limit_data_in_dev('created_at', 7) }}
      ```
    arguments:
      - name: column_name
        type: string
        description: Date column to filter on. Defaults to 'created_at'.
      - name: days
        type: integer
        description: Number of days to include. Defaults to 3.
````
The description includes a usage example because that’s what developers actually need. They want to copy-paste something that works, then modify it. Abstract argument descriptions don’t help as much as a concrete example.
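To make the documented behavior concrete, here is a minimal sketch of `limit_data_in_dev` itself (the `target.name == 'dev'` check and the Snowflake-style `DATEADD` are assumptions; adjust both to your environment names and warehouse):
```sql
{% macro limit_data_in_dev(column_name='created_at', days=3) %}
    {% if target.name == 'dev' %}
        -- Only emitted in dev; in production the macro renders nothing
        AND {{ column_name }} >= DATEADD('day', -{{ days }}, CURRENT_DATE)
    {% endif %}
{% endmacro %}
```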
Testing macro behavior
Macros need tests like any other code, and they fit naturally into your broader testing strategy. dbt gives you two approaches.
Integration tests create models that use your macros and verify the output:
```sql
-- models/tests/test_cents_to_dollars.sql
{{ config(materialized='table', schema='dbt_tests') }}

WITH test_data AS (
    SELECT 1000 AS amount_cents, 10.00 AS expected
    UNION ALL
    SELECT 999 AS amount_cents, 9.99 AS expected
    UNION ALL
    SELECT 1 AS amount_cents, 0.01 AS expected
)

SELECT
    amount_cents,
    {{ cents_to_dollars('amount_cents') }} AS actual,
    expected,
    {{ cents_to_dollars('amount_cents') }} = expected AS passed
FROM test_data
```
Run `dbt build --select test_cents_to_dollars` and check that all rows have `passed = true`.
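To make that check automatic instead of eyeballing rows, you could pair the model with a singular test that returns the failures (a sketch; the filename is hypothetical, and dbt fails the test if any rows come back):
```sql
-- tests/assert_cents_to_dollars_passes.sql
SELECT *
FROM {{ ref('test_cents_to_dollars') }}
WHERE NOT passed
```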
Unit tests (dbt 1.8+) let you test SQL logic before materialization:
```yaml
unit_tests:
  - name: test_cents_to_dollars_conversion
    model: mrt__finance__orders
    given:
      - input: ref('base__shopify__orders')
        rows:
          - {order_id: 1, amount_cents: 1000}
          - {order_id: 2, amount_cents: 50}
    expect:
      rows:
        - {order_id: 1, order__amount_dollars: 10.00}
        - {order_id: 2, order__amount_dollars: 0.50}
```
This tests the macro in context, as part of a real model transformation. For a deeper dive, see the complete guide to dbt unit testing.
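For reference, the `mrt__finance__orders` model under test might be as simple as this (an assumed shape, inferred from the unit test’s columns):
```sql
-- models/marts/finance/mrt__finance__orders.sql (hypothetical path)
SELECT
    order_id,
    {{ cents_to_dollars('amount_cents') }} AS order__amount_dollars
FROM {{ ref('base__shopify__orders') }}
```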
Check compiled output. When debugging, run `dbt compile --select model_name` and look at `target/compiled/`. The compiled SQL is the truth; if it looks wrong, the macro is wrong.
Handling breaking changes
Sometimes you need to change how a macro works, and the change will affect existing behavior.
The dbt-utils package demonstrates the pattern:
- Create a new macro with the improved behavior and a clear name
- Keep the old macro but add a deprecation warning using `exceptions.warn()`
- Document the migration path in your changelog or README
- Remove the old macro in a major version
```sql
{% macro cents_to_dollars_v2(column_name, scale=2, round_mode='half_up') %}
    {# New version with configurable rounding #}
{% endmacro %}

{% macro cents_to_dollars(column_name, scale=2) %}
    {{ exceptions.warn("cents_to_dollars is deprecated. Use cents_to_dollars_v2 instead.") }}
    ROUND({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}
```
For internal projects, you might not need this formality. If your team is small and you can update all usages in one PR, just change the macro and fix the call sites. Versioning matters more when macros are shared across teams or published as packages.
Communicate changes clearly. A Slack message, a PR description, a note in your project’s changelog, whatever works for your team. Surprise behavior changes cause more friction than the change itself.
Start with restraint
The best macro code is often no macro at all. Inline SQL, repeated a few times, gives you the information you need to build the right abstraction when it finally earns one. A focused macro with a clear name and good documentation pays for itself; a premature one just adds indirection.
Your goal is building a dbt project that your team can understand and maintain. Sometimes that means three similar lines of code instead of one macro call. That’s good judgment, not failure.