
dbt-audit-helper CI/CD Integration

How to integrate dbt-audit-helper into CI/CD pipelines — dbt Cloud PR jobs, GitHub Actions with --defer, and automated regression detection.

Planted
Tags: dbt, data quality, testing, automation

Running dbt-audit-helper automatically on every pull request catches regressions — unintended output changes — before they reach production, without requiring the PR author to run comparison macros manually.

The Core Pattern

The macro best suited for CI is compare_all_columns. It produces a per-column summary showing perfect_match, null_in_a, null_in_b, missing_from_a, missing_from_b, and conflicting_values counts. Wrap it in a test that fails on any conflicts:

models:
  - name: mrt__finance__revenue
    data_tests:
      - dbt_utils.expression_is_true:
          expression: "conflicting_values = 0"
          where: "1=1"

This test passes when every column in your dev/PR model matches production exactly. Any difference triggers a failure with a clear diagnostic: which columns diverged and by how much.
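
For the expression above to have a conflicting_values column to evaluate, the comparison output needs to live in a model the test can attach to. A minimal sketch of that wiring, using a hypothetical audit model name (the macro call mirrors the dbt Cloud example in the next section):

-- models/audits/audit__mrt__finance__revenue.sql (hypothetical path and name)
-- Materializes the per-column comparison so a generic test can assert on it
{{ config(materialized='view', tags=['audit']) }}

{{ audit_helper.compare_all_columns(
    a_relation=ref('mrt__finance__revenue'),
    b_relation=api.Relation.create(
        database='analytics_db',
        schema='production',
        identifier='mrt__finance__revenue'
    ),
    primary_key_columns=['revenue__id']
) }}

One reasonable reading of the pattern is to point the expression_is_true test at this audit model, since that is where conflicting_values actually exists.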

dbt Cloud CI Jobs

In dbt Cloud, CI jobs triggered on pull requests compile ref() to the PR schema automatically. This means compare_all_columns naturally compares your PR build against production without any extra configuration:

{{ audit_helper.compare_all_columns(
    a_relation=ref('mrt__finance__revenue'),
    b_relation=api.Relation.create(
        database='analytics_db',
        schema='production',
        identifier='mrt__finance__revenue'
    ),
    primary_key_columns=['revenue__id']
) }}

Here, ref('mrt__finance__revenue') resolves to the PR schema while api.Relation.create points at production. The comparison happens on every PR, and the job blocks merging if mismatches are detected.

The key advantage of dbt Cloud’s approach is that schema resolution is handled for you. You don’t need to parameterize the target schema or maintain environment-specific logic. The same macro code works identically in development, CI, and production contexts.
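
The CI job's command list can stay minimal. A typical configuration (the selectors here are illustrative, not something dbt Cloud requires) builds what changed and then runs the tagged audit tests described further below:

# dbt Cloud CI job commands (illustrative)
dbt build --select state:modified+
dbt test --select tag:audit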

dbt Core with GitHub Actions

For dbt Core projects, use --defer and --state to target only changed models, then run audit tests:

.github/workflows/dbt-ci.yml
steps:
  - name: Build modified models
    run: |
      dbt build --select state:modified+ --defer --state ./prod-manifest
  - name: Run audit tests
    run: |
      dbt test --select tag:audit --store-failures

The --defer flag tells dbt to resolve references to any model not being rebuilt in this PR against the production environment. The --state flag points to a directory containing a previously saved manifest.json from that environment. Together, they ensure only modified models are rebuilt while unmodified dependencies resolve to their production versions.
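
How ./prod-manifest gets populated is up to you. One common approach, sketched here with a hypothetical S3 location (any artifact store your production job publishes to will do), is to download the manifest in a step before the build:

- name: Fetch production manifest
  run: |
    mkdir -p ./prod-manifest
    # Hypothetical bucket and path; point this at wherever your production run stores manifest.json
    aws s3 cp s3://my-dbt-artifacts/prod/manifest.json ./prod-manifest/manifest.json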

--store-failures persists any mismatches into your warehouse as tables, which you can query after the CI run to investigate what changed and why. This is more useful than log output for complex debugging.
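
By default dbt writes these failure tables to a schema suffixed with dbt_test__audit and names them after the generated test, so investigation can start with a plain query. The table name below is hypothetical; check the CI logs or the audit schema for the real one:

select
    column_name,
    conflicting_values,
    missing_from_a,
    missing_from_b
from analytics_db.ci_schema_dbt_test__audit.dbt_utils_expression_is_true_mrt__finance__revenue
where conflicting_values > 0
order by conflicting_values desc;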

Structuring Audit Tests

Not every model needs an audit test. Focus on models where unintended output changes would have business impact:

  • Mart models consumed by BI tools or reverse ETL
  • Models with complex business logic where refactoring risk is highest
  • Models feeding financial reports where even small numerical differences matter
  • Models with downstream ML consumers sensitive to distribution shifts

Tag audit-specific tests so they can be run selectively:

models:
  - name: mrt__finance__revenue
    data_tests:
      - dbt_utils.expression_is_true:
          expression: "conflicting_values = 0"
          where: "1=1"
          tags: ["audit"]
          config:
            severity: error

The severity: error ensures CI fails on mismatches rather than just warning. For models where small numerical differences are acceptable (see Floating-Point Precision in Data Comparison), you might use severity: warn or add tolerance logic to the expression.
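
As an illustration only, with an arbitrary tolerance of 10 conflicting rows that should be tuned per model, a softer variant could look like this:

models:
  - name: mrt__finance__revenue
    data_tests:
      - dbt_utils.expression_is_true:
          # Arbitrary tolerance for illustration; a percentage-based check may fit better
          expression: "conflicting_values <= 10"
          where: "1=1"
          tags: ["audit"]
          config:
            severity: warn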

Handling Expected Changes

Not all output changes are regressions. Sometimes a PR intentionally changes model output — fixing a bug, adding a column, changing a calculation. In these cases, the audit test will fail by design.

Options for handling intentional changes:

  1. Document in the PR that audit failures are expected and explain what changed
  2. Temporarily skip the audit test by removing the tag or using --exclude tag:audit for that specific CI run (see the command sketch after this list)
  3. Update the test threshold if the change introduces acceptable differences (e.g., rounding change that affects < 0.01% of rows)
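
For option 2, the exclusion is a one-line change to the CI test command:

# Skip audit tests for this run only; all other tests still execute
dbt test --exclude tag:audit --store-failures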

Permanently disabling audit tests because they “always fail” is counterproductive. If a test consistently detects real changes, the models are genuinely unstable — the underlying volatility is the problem to address.

Interactive Investigation

When CI detects mismatches, the next step is diagnosis. --store-failures gives you tables to query. For more visual investigation, Hex offers a notebook template that pairs with audit-helper for PR validation workflows — useful for teams that want a richer diagnostic experience than querying stored failure tables directly.

The progressive validation workflow applies here too. Start with the high-level compare_all_columns summary from CI. If it shows conflicts, use compare_which_relation_columns_differ to narrow down, then compare_column_values to understand the specific nature of each column’s mismatches.
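
As a sketch of that last step, drilling into a single suspect column with compare_column_values might look like the following. The argument names follow the package's documented signature, but verify them against your installed version; net_revenue is a hypothetical column chosen for illustration:

{% set prod_query %}
    select * from analytics_db.production.mrt__finance__revenue
{% endset %}

{% set pr_query %}
    select * from {{ ref('mrt__finance__revenue') }}
{% endset %}

{{ audit_helper.compare_column_values(
    a_query=prod_query,
    b_query=pr_query,
    primary_key='revenue__id',
    column_to_compare='net_revenue'
) }}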

Scaling Considerations

audit-helper requires manual setup per model. Unlike tools like Datafold that automatically discover and compare all affected models in a PR, you need to explicitly create a test for each model you want to validate. For projects with a handful of critical mart models, this is fine. For migrations spanning dozens of models, the setup overhead becomes significant.

For enterprise-scale CI validation, the dbt-audit-helper-ext package by Infinite Lambda adds codegen scripts that auto-generate validation macros and threshold-based acceptance criteria (100% = pass, 99%+ = warning, below 99% = fail). This reduces the per-model setup cost for large projects.
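
The generated macros are worth reading directly. As a rough illustration of the idea only, and assuming the compare_all_columns count columns are mutually exclusive per row, a threshold check reduces to a match rate:

-- Illustrative match-rate classification, not the package's actual generated code
with summary as (
    select
        column_name,
        perfect_match,
        perfect_match + null_in_a + null_in_b
            + missing_from_a + missing_from_b + conflicting_values as total_rows
    from {{ ref('audit__mrt__finance__revenue') }}  -- hypothetical audit model from earlier
)

select
    column_name,
    round(100.0 * perfect_match / nullif(total_rows, 0), 2) as match_pct,
    case
        when perfect_match = total_rows then 'pass'
        when 100.0 * perfect_match >= 99.0 * total_rows then 'warning'
        else 'fail'
    end as acceptance
from summary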