Note

dbt Test Severity and Performance Tuning

How to configure dbt test severity levels, optimize expensive tests on BigQuery, and structure test execution for cost-effective data quality.

Tags: dbt · BigQuery · data quality · testing · cost optimization

dbt test severity and scheduling determine whether a failure blocks the pipeline and when tests run. Misconfigured severity or unconstrained test execution leads to either silent failures or pipelines that stop unnecessarily.

Severity Levels

dbt supports two severity levels for tests: error (the default) and warn.

error stops the pipeline. If you run dbt build, an error-severity test failure prevents downstream models from running. Use this for problems that would produce incorrect data if the pipeline continued:

  • Primary key violations (duplicates cause join explosions)
  • Freshness failures on critical tables (stale data is worse than no data)
  • NULL values in required fields that feed dashboards
  • Value ranges where violations indicate data corruption

warn logs the failure but lets the pipeline continue. Use this for problems that need investigation but shouldn’t block production:

```yaml
columns:
  - name: order_value
    tests:
      - dbt_expectations.expect_column_mean_to_be_between:
          min_value: 50
          max_value: 200
          config:
            severity: warn
```

Statistical tests like expect_column_mean_to_be_between are natural candidates for warn severity. A shift in average order value might signal a real problem, or it might reflect a seasonal trend, a successful promotion, or a market shift. You want visibility without halting production while you investigate.

Severity Guidelines by Test Type

| Test category | Recommended severity | Reasoning |
| --- | --- | --- |
| Primary key (`unique`, `not_null`) | error | Duplicates and nulls in keys corrupt all downstream joins |
| Freshness (`expect_row_values_to_have_recent_data`) | error on critical tables, warn on others | Stale data in a finance mart is an emergency; stale data in a marketing staging table can wait |
| Format (`expect_column_values_to_match_regex`) | error | Invalid formats typically indicate upstream schema changes |
| Range (`expect_column_values_to_be_between`) | error | Out-of-range values usually indicate data corruption |
| Statistical (`expect_column_mean_to_be_between`) | warn | Distribution shifts need investigation, not pipeline halts |
| Volume (`expect_table_row_count_to_be_between`) | warn or error | Depends on how tightly you can define expected bounds |
| Completeness (`expect_row_values_to_have_data_for_every_n_datepart`) | warn | Missing days may reflect source delays, not failures |

The goal is a test suite where error means “stop everything and fix this now” and warn means “look at this today.” If your team ignores warnings, you have too many of them. If your pipeline stops daily, you have too many errors on volatile tests.

Performance on BigQuery

Data quality tests run SQL against your warehouse. On BigQuery’s on-demand pricing, every test scans data and costs money. Some tests are cheap (a COUNT(*) with a partition filter). Others are expensive (a full-table statistical computation). Managing this cost is a first-class concern.

Filter on Partition Columns

If your table is partitioned by date, always use row_condition to filter on the partition column:

```yaml
- dbt_expectations.expect_column_values_to_be_between:
    min_value: 0
    max_value: 1000000
    row_condition: "event_date >= current_date() - 30"
```

Without this filter, BigQuery scans the entire table. With it, BigQuery prunes to the last 30 partitions. On a table with years of event data, this can reduce bytes scanned — and cost — by 90% or more.
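The reason this works: dbt-expectations compiles the test into a query that counts violating rows, and the row_condition lands in the WHERE clause, where BigQuery's partition pruning can see it. Roughly (a simplified sketch, not the exact compiled SQL):

```sql
-- simplified sketch of the compiled test; table name is a placeholder
select count(*) as failures
from `project.dataset.events`
where event_date >= current_date() - 30   -- row_condition: enables partition pruning
  and amount not between 0 and 1000000    -- the assertion itself
```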

This is especially important for these tests, which tend to be the most expensive:

  • expect_row_values_to_have_data_for_every_n_datepart (generates a date spine and joins)
  • expect_column_mean_to_be_between (aggregates across all rows)
  • expect_column_values_to_match_regex (evaluates a function on every row)
  • expect_table_row_count_to_equal_other_table (full count on two tables)

Run Expensive Tests Only in Production

Some tests add value in production but are wasteful in development or CI. Use dbt’s enabled config with target-aware Jinja:

```yaml
- dbt_expectations.expect_column_mean_to_be_between:
    min_value: 50
    max_value: 200
    config:
      enabled: "{{ target.name == 'prod' }}"
```

This compiles the test only when running against the prod target. In development and CI, the test is skipped entirely — no compilation, no warehouse query, no cost.

Use this pattern for:

  • Statistical tests that need production-scale data to be meaningful
  • Volume checks where development datasets are intentionally small
  • Cross-table comparisons where both tables must be fully materialized

Tag-Based Test Scheduling

For tests that should run on a schedule rather than every build, use dbt tags:

```yaml
- dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
    date_col: event_date
    date_part: day
    config:
      tags: ['slow', 'daily']
```

Then in your CI pipeline, exclude slow tests:

```shell
dbt test --exclude tag:slow
```

And in your daily production run, include everything:

```shell
dbt test
```

Or run slow tests on their own schedule:

```shell
dbt test --select tag:slow
```

This gives you fast CI feedback (under 5 minutes) while still running comprehensive validation daily. The tag names are arbitrary — slow, daily, weekly, expensive, nightly — pick whatever convention your team will actually use.

Estimating Test Costs

A rough framework for BigQuery on-demand pricing ($6.25 per TB scanned as of 2026):

| Test type | Typical scan | Cost on a 100 GB table |
| --- | --- | --- |
| `unique` / `not_null` (single column) | Column only | ~$0.01 |
| `expect_column_values_to_be_between` (with partition filter) | Filtered partition | ~$0.001 |
| `expect_column_values_to_be_between` (no filter) | Full column | ~$0.01 |
| `expect_column_mean_to_be_between` (full table) | Full column | ~$0.01 |
| `expect_row_values_to_have_data_for_every_n_datepart` (full table) | Full table + date spine | ~$0.06 |
| `expect_table_row_count_to_equal_other_table` | Two full tables | ~$0.12 |

Individual tests are cheap. The cost adds up when you have 200+ tests running multiple times per day across dozens of models. Partition filtering and tag-based scheduling are the two highest-leverage optimizations.
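The arithmetic behind these estimates is straightforward: bytes scanned divided by a terabyte, times the on-demand rate. A small sketch for budgeting a whole suite (the example numbers are illustrative, not from a real project):

```python
ON_DEMAND_USD_PER_TB = 6.25  # BigQuery on-demand rate used in the table above

def scan_cost_usd(bytes_scanned: float) -> float:
    """Cost of a single test query at on-demand pricing."""
    return bytes_scanned / 1e12 * ON_DEMAND_USD_PER_TB

def daily_suite_cost_usd(tests: int, avg_bytes_per_test: float, runs_per_day: int) -> float:
    """Total daily cost of a test suite run multiple times per day."""
    return tests * runs_per_day * scan_cost_usd(avg_bytes_per_test)

# 200 tests scanning ~2 GB each, 3 builds per day:
print(round(daily_suite_cost_usd(200, 2e9, 3), 2))  # → 7.5
```

At these assumptions the suite costs roughly $7.50 per day, which is exactly the regime where partition filtering and tag-based scheduling start to matter.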

Test Placement by Layer

Where you place tests in your three-layer architecture affects both coverage and cost.

Sources and base models: Focus on freshness, schema validation, and basic format checks. This is where problems enter your pipeline. Catching them here prevents cascading failures downstream.

```yaml
models:
  - name: base__ga4__events
    tests:
      - dbt_expectations.expect_row_values_to_have_recent_data:
          datepart: hour
          interval: 48
    columns:
      - name: user_pseudo_id
        tests:
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^[0-9]+\.[0-9]+$'
```

Intermediate models: Focus on join integrity and transformation validation. Verify that joins don’t drop or duplicate rows unexpectedly. These tests are often the first signal that something broke upstream.
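One way to express the no-drop, no-duplicate check with dbt-expectations, using hypothetical model names:

```yaml
models:
  - name: int__orders_enriched    # hypothetical intermediate model
    tests:
      # joining dimensions onto orders should neither drop nor duplicate rows
      - dbt_expectations.expect_table_row_count_to_equal_other_table:
          compare_model: ref("base__shop__orders")
```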

Marts: Focus on business rules and aggregation sanity checks. These tests protect your final outputs — the tables that BI tools query and stakeholders rely on.

```yaml
models:
  - name: mrt__marketing__campaign_performance
    columns:
      - name: roas
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 100
              row_condition: "spend > 0"
```

The general principle: test intensity should increase toward the edges of your DAG. Sources are where problems enter. Marts are where they reach consumers. The middle layers get lighter coverage focused on join integrity.

Starting Point for New Projects

When adding dbt-expectations to an existing project, a minimal set of three tests on the most important model covers the most common gaps native dbt tests miss:

  1. Freshness: expect_row_values_to_have_recent_data on the main timestamp column
  2. Format: expect_column_values_to_match_regex on one key identifier column
  3. Range: expect_column_values_to_be_between on one numeric KPI column

Expand from there based on which failures actually occur in production.
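Put together, the three-test starting set might look like this sketch (model, column names, and bounds are placeholders to adapt to your project):

```yaml
models:
  - name: base__shop__orders                # placeholder: your most important model
    tests:
      - dbt_expectations.expect_row_values_to_have_recent_data:
          datepart: day
          interval: 2
    columns:
      - name: order_id                      # placeholder: key identifier column
        tests:
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^[A-Z0-9-]+$'
      - name: order_value                   # placeholder: numeric KPI column
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 100000
```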