
CI/CD Data Quality Testing in dbt

How to integrate data quality testing into CI/CD pipelines — Slim CI with state:modified+, GitHub Actions workflows, and tools like Datafold and Recce for regression detection.

Planted
dbt · data quality · testing · automation

Testing in production catches problems after they reach the warehouse. Testing in CI prevents them from getting there. An effective data quality strategy uses both: production tests as a safety net, CI tests as a gate.

Slim CI: The Foundation

dbt Cloud’s Slim CI pattern is the cornerstone of CI-based data quality testing. It uses manifest comparison to identify which models have changed and runs only those models plus their downstream dependencies:

dbt build --select state:modified+ --defer --state ./prod-manifest

Three flags make this work:

  • state:modified+ selects models that have been modified in the current branch plus everything downstream of them (the + suffix). This catches both direct changes and cascading impacts.
  • --defer tells dbt to read from the production manifest for any unmodified upstream models rather than rebuilding them. Your CI job doesn’t need to rebuild the entire DAG — just the changed portion.
  • --state points to the directory containing the production manifest for comparison.
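Under the hood, the selection is just a manifest diff. A minimal Python sketch of the idea (the manifest shape here is deliberately simplified — real dbt manifests nest checksums under `nodes[...]["checksum"]`, but the comparison and downstream walk are the same idea):

```python
# Sketch of how state:modified+ selection works conceptually.
# Simplified manifest shape:
#   {"nodes": {model_name: checksum}, "child_map": {model_name: [children]}}

def modified_plus(prod_manifest: dict, branch_manifest: dict) -> set[str]:
    """Models whose checksum changed (or are new), plus everything downstream."""
    modified = {
        name for name, checksum in branch_manifest["nodes"].items()
        if prod_manifest["nodes"].get(name) != checksum  # new or changed
    }
    # Walk child_map to pick up cascading impacts (the "+" suffix).
    selected, frontier = set(modified), list(modified)
    while frontier:
        node = frontier.pop()
        for child in branch_manifest["child_map"].get(node, []):
            if child not in selected:
                selected.add(child)
                frontier.append(child)
    return selected

prod = {"nodes": {"stg_orders": "abc", "fct_orders": "def"},
        "child_map": {"stg_orders": ["fct_orders"], "fct_orders": []}}
branch = {"nodes": {"stg_orders": "abc2", "fct_orders": "def"},
          "child_map": {"stg_orders": ["fct_orders"], "fct_orders": []}}
print(sorted(modified_plus(prod, branch)))  # ['fct_orders', 'stg_orders']
```

Here `stg_orders` is selected because its checksum changed, and `fct_orders` only because it sits downstream — exactly the two categories `state:modified+` is meant to catch.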

The result: a CI job that would take 45 minutes to rebuild the full project completes in 2-3 minutes by focusing on what actually changed. Compute costs drop by 90%+ while the job still catches regressions in modified models and their downstream consumers.

For dbt Core 1.8+, the --empty flag takes this further by validating contracts and schema structure without processing any data at all:

dbt build --select state:modified+ --empty --defer --state ./prod-manifest

This enables governance-only CI checks that complete in seconds at near-zero warehouse cost. Schema validation, contract enforcement, and compilation checks all run without scanning a single row.
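Contracts are what make the --empty pattern worthwhile: they declare the expected schema up front, so a structural mismatch fails at build time instead of surfacing downstream. A minimal sketch of an enforced contract in a model's YAML (the model and column names are illustrative):

```yaml
# models/marts/schema.yml — illustrative model and columns
models:
  - name: fct_orders
    config:
      contract:
        enforced: true
    columns:
      - name: order_id
        data_type: int
        constraints:
          - type: not_null
      - name: order_total
        data_type: numeric
```

With `enforced: true`, a PR that renames `order_id` or changes its type fails the --empty build in seconds, without scanning any data.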

GitHub Actions Workflow

For teams on dbt Core (or wanting more control than dbt Cloud’s built-in CI), a GitHub Actions workflow covers the essential quality gates:

name: dbt CI
on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      # Swap dbt-snowflake for your warehouse adapter; warehouse credentials
      # are expected via profiles.yml or environment variables.
      - name: Install dbt and SQLFluff
        run: pip install dbt-core dbt-snowflake sqlfluff sqlfluff-templater-dbt

      - name: Lint SQL
        run: sqlfluff lint models/

      - name: Build and test modified models
        run: dbt build --select state:modified+ --defer --state ./prod-manifest

      - name: Run data tests
        run: dbt test

This workflow enforces three layers of quality:

  1. SQL linting catches style violations and common SQL anti-patterns before they reach the warehouse. SQLFluff is the standard linter for dbt projects, with native Jinja templating support.
  2. Build and test compiles models, materializes them in a CI schema, and runs associated generic and singular tests. Compilation failures surface contract violations and type mismatches. Test failures surface data quality issues.
  3. Full test suite runs all tests, including those not directly associated with modified models but potentially affected by schema changes.

The state:modified+ pattern requires a production manifest artifact. Store it as a GitHub Actions artifact from your production deployment, in cloud storage, or in a dedicated branch. Without it, CI falls back to rebuilding everything.
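One way to wire this up is with workflow artifacts (a sketch — the step and artifact names here are assumptions): the production deployment workflow publishes its manifest.json, and the CI job pulls it into ./prod-manifest before the dbt build step.

```yaml
# Production deployment workflow: publish the manifest after `dbt run`.
- name: Upload production manifest
  uses: actions/upload-artifact@v4
  with:
    name: prod-manifest
    path: target/manifest.json

# CI workflow: fetch it into ./prod-manifest before `dbt build`.
# Note: with download-artifact@v4, fetching an artifact produced by a
# *different* workflow run requires `run-id` and `github-token` inputs —
# which is why cloud storage or a dedicated branch is often simpler.
- name: Download production manifest
  uses: actions/download-artifact@v4
  with:
    name: prod-manifest
    path: prod-manifest
```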

Beyond Build-and-Test: Regression Detection Tools

The standard CI workflow catches structural and rule-based issues but misses value-level regressions. A model might build successfully and pass all tests while producing subtly different numbers due to a logic change. Three tools fill this gap.

Datafold

Datafold performs automated data diffing between your PR branch and production. It compares row counts, column-level statistics, and actual values across corresponding tables. The output appears directly in your pull request as a comment showing which columns changed, by how much, and in which rows.

This catches the category of bugs where a refactored model produces slightly different results — perhaps a COALESCE that now handles NULLs differently, or a join condition that matches fewer rows. These changes pass every test but produce different numbers in reports.
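The core of data diffing is easy to sketch, even though doing it efficiently at warehouse scale is the hard part. A toy version in Python (the table shapes and column names are made up), comparing keyed rows between a production run and a PR-branch run of the same model:

```python
# Toy data diff: compare two runs of the same model, keyed by primary key.
# Real tools like Datafold push this comparison into the warehouse; this
# only shows the shape of the report a reviewer sees on the PR.

def diff_rows(prod: dict[int, dict], branch: dict[int, dict]) -> dict:
    added = sorted(branch.keys() - prod.keys())
    removed = sorted(prod.keys() - branch.keys())
    changed = {}
    for key in prod.keys() & branch.keys():
        cols = {c for c in prod[key] if prod[key][c] != branch[key].get(c)}
        if cols:
            changed[key] = sorted(cols)
    return {"added": added, "removed": removed, "changed": changed}

# A refactored COALESCE now maps NULL discounts to 0 — every test still
# passes, but row 2 quietly carries a different value:
prod = {1: {"total": 100, "discount": 5}, 2: {"total": 50, "discount": None}}
branch = {1: {"total": 100, "discount": 5}, 2: {"total": 50, "discount": 0}}
print(diff_rows(prod, branch))
# {'added': [], 'removed': [], 'changed': {2: ['discount']}}
```

A not_null or accepted_values test would never flag this change; only a value-level comparison does.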

Recce

Recce analyzes column-level lineage to categorize changes as breaking, partial-breaking, or non-breaking. Rather than comparing data values, it examines the structural impact of your SQL changes on downstream consumers. A column rename is classified differently from a filter change, and each category triggers appropriate review requirements.

This approach is lighter-weight than full data diffing and works well for large-scale projects where comparing every row is prohibitively expensive.

SQLFluff

Beyond basic linting, SQLFluff enforces SQL style consistency across a team. In a CI context, it prevents style regressions from merging and ensures that all dbt SQL follows the same conventions. This isn’t data quality per se, but code quality directly impacts data quality over time — inconsistent SQL is harder to review, harder to debug, and more likely to contain subtle logic errors.
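The conventions live in a .sqlfluff file at the project root. A sketch of a starting configuration (the dialect and rule choices here are illustrative, not a recommendation):

```ini
# .sqlfluff — illustrative starting point for a dbt project
[sqlfluff]
dialect = snowflake
templater = dbt          # requires the sqlfluff-templater-dbt package

[sqlfluff:indentation]
indented_joins = false

[sqlfluff:rules:capitalisation.keywords]
capitalisation_policy = lower
```

Running `sqlfluff fix` once against the existing codebase handles the initial cleanup; after that, CI only has to hold the line.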

What CI Catches That Production Doesn’t

The value of CI testing isn’t just “catching bugs earlier.” It fundamentally changes the failure mode:

| Failure type | Caught in production | Caught in CI |
| --- | --- | --- |
| Schema mismatch | After bad data is written | Before merge |
| Contract violation | After model fails to build | Before merge |
| Test failure | After bad data propagates downstream | Before merge |
| Value regression | After stakeholders notice wrong numbers | Before merge (with Datafold) |
| SQL anti-patterns | Never (unless they cause visible bugs) | Before merge (with SQLFluff) |

The difference in cost is significant. A Forrester Total Economic Impact study found a 30% boost in developer productivity and 60% savings in data rework time for teams adopting dbt Cloud with CI workflows. The gains come not from the CI infrastructure itself but from the shift-left effect: problems found in a PR take minutes to fix, while problems found in production take hours or days.

Adoption sequence

  • state:modified+ with basic tests eliminates the most common regression category (changes that break downstream models) and is the highest-leverage starting point.
  • SQLFluff in CI prevents style debt from accumulating. Configure once; the initial cleanup of existing violations is the main effort.
  • Datafold and similar data-diffing tools provide the deepest protection but require infrastructure (comparison schemas, artifact storage) and compute budget. Most valuable for teams where value-level regressions are costly — finance, regulatory reporting, pricing models.
  • Branch protection rules should require passing CI checks before merging to main. A CI workflow that does not block merges tends to be ignored.