
dbt-utils Generic Tests

Full reference for dbt-utils generic tests: YAML syntax, the Fusion arguments: key change, group_by_columns support, and when to use each test.

Tags: dbt, testing, data quality

dbt-utils ships a suite of generic tests that go well beyond the four built-in tests (unique, not_null, accepted_values, relationships). They’re the first extension most dbt projects add, and for good reason — unique_combination_of_columns and expression_is_true alone cover situations that come up in nearly every project.

For where these fit in the broader testing picture, see dbt Testing Taxonomy.

Fusion / dbt Core 1.10.6+ Syntax Change

Before using any dbt-utils tests, check your dbt version. Starting with dbt Fusion (dbt 2.0) and dbt Core 1.10.6+, test arguments must be nested under an arguments: key in YAML:

```yaml
# dbt Core < 1.10.6 (original syntax)
- dbt_utils.unique_combination_of_columns:
    combination_of_columns:
      - revenue__date
      - revenue__currency
```

```yaml
# dbt Fusion / dbt Core 1.10.6+ (new syntax)
- dbt_utils.unique_combination_of_columns:
    arguments:
      combination_of_columns:
        - revenue__date
        - revenue__currency
```

If you upgrade dbt and your tests start failing with argument-related errors, this is the first thing to check. The examples below show the original syntax — add the arguments: wrapper if you’re on a newer version.

The Key Tests

unique_combination_of_columns

Tests composite uniqueness — that no combination of the specified columns repeats. This is the test for any model where the grain is defined by multiple columns:

```yaml
models:
  - name: mrt__finance__daily_revenue
    data_tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - revenue__date
            - revenue__currency
```

Use this wherever unique on a single column wouldn’t be meaningful. A fact table with one row per order per product per day needs this test on (order__id, product__id, order__date), not unique on any individual column.
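As a sketch, the test for that fact-table grain might look like this (the model name is illustrative):

```yaml
models:
  - name: mrt__sales__daily_order_products  # hypothetical model at order/product/day grain
    data_tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - order__id
            - product__id
            - order__date
```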

expression_is_true

Evaluates any SQL expression per row. It fails if any row produces FALSE or NULL. This is the Swiss army knife for assertions that don’t fit neatly into other test types:

```yaml
models:
  - name: mrt__finance__invoices
    data_tests:
      - dbt_utils.expression_is_true:
          expression: "invoice__total_amount >= invoice__tax_amount"
          where: "invoice__status != 'voided'"
```

Common uses:

  • Cross-column validation: end_date >= start_date, quantity > 0
  • Computed column checks: discount_amount <= subtotal
  • Status-dependent checks: shipped_at IS NOT NULL where status = 'shipped'
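The status-dependent case from the list above could be written like this (column names are illustrative):

```yaml
- dbt_utils.expression_is_true:
    expression: "shipment__shipped_at IS NOT NULL"
    where: "shipment__status = 'shipped'"
```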

Migration note from v0.x: The old condition argument was removed in v1.0. Use the native where config instead (as shown above). Any test YAML with condition: needs updating.

accepted_range

Validates that numeric values fall within bounds:

```yaml
columns:
  - name: order__total_amount
    data_tests:
      - dbt_utils.accepted_range:
          min_value: 0
          inclusive: true
  - name: conversion__rate
    data_tests:
      - dbt_utils.accepted_range:
          min_value: 0
          max_value: 1
          inclusive: true
```

You can specify min_value, max_value, or both. The inclusive parameter controls whether the bounds are inclusive (>=/<=) or exclusive (>/<).

This catches impossible values before they corrupt metrics — negative revenue, conversion rates above 100%, session durations of negative seconds. Run it on any numeric column with meaningful bounds.

recency

Checks that data is fresh by verifying the most recent value in a date column is within a specified interval:

```yaml
models:
  - name: base__stripe__charges
    data_tests:
      - dbt_utils.recency:
          datepart: hour
          field: charge__created_at
          interval: 24
```

This test fails if the newest row in charge__created_at is more than 24 hours old. Set datepart to match your data’s natural granularity.

This is a model-level freshness check, distinct from source freshness checks. Source freshness (dbt source freshness) checks whether the source was recently loaded. recency checks whether the model’s output contains recent data, which also catches transformation bugs that accidentally filter out recent rows.
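For comparison, a source freshness check lives on the source definition rather than the model. A minimal sketch (the source and field names are illustrative):

```yaml
sources:
  - name: stripe
    loaded_at_field: created_at
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: charges
```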

Complete Test Reference

| Test | What it validates |
| --- | --- |
| unique_combination_of_columns | No repeated combination of the specified columns |
| accepted_range | Numeric values fall within min/max bounds |
| expression_is_true | An arbitrary SQL expression is true for every row |
| recency | Most recent value in a date column is within N dateparts |
| at_least_one | At least one non-null value exists in the column |
| not_constant | Values vary; not all rows have the same value |
| not_null_proportion | Non-null proportion meets a minimum threshold (e.g., at_least: 0.95) |
| relationships_where | Filtered referential integrity with a WHERE clause |
| mutually_exclusive_ranges | Date/number ranges have no overlaps (both bounds must be NOT NULL) |
| sequential_values | A sequence column has no gaps |
| equal_rowcount | Two relations have the same row count |
| fewer_rows_than | Relation A has strictly fewer rows than relation B |
| equality | Full model comparison (with optional precision for numeric columns) |
| not_accepted_values | Inverse of accepted_values: values are not in the specified list |
| not_empty_string | Strings are non-empty (with trim_whitespace option) |
| cardinality_equality | Two columns have the same number of distinct values |

Tests Worth Knowing More About

not_null_proportion is useful where hard not_null is too strict. An email column with 5% nulls might be acceptable; 40% nulls might indicate a broken pipeline. Set a threshold and test against it:

```yaml
- dbt_utils.not_null_proportion:
    at_least: 0.95
```

mutually_exclusive_ranges validates that a table of intervals (date ranges, price tiers, validity periods) has no overlaps. A billing system that accidentally generates two active pricing tiers is a category of bug this test catches directly.
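A sketch of that pricing-tier check, assuming illustrative column names for the tier validity window:

```yaml
- dbt_utils.mutually_exclusive_ranges:
    lower_bound_column: tier__valid_from
    upper_bound_column: tier__valid_to
    partition_by: product__id  # check for overlaps within each product
    gaps: allowed              # tiers need not be contiguous
```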

equality compares two models row-by-row. It’s the right tool for validating a refactored model produces identical output to its predecessor — essential during migrations. The precision argument handles floating-point comparison correctly rather than requiring exact equality on computed numerics.
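A minimal refactoring-validation sketch, where the old model name is hypothetical:

```yaml
- dbt_utils.equality:
    compare_model: ref('mrt__finance__invoices_old')  # predecessor being replaced
    precision: 4  # round numeric columns before comparing
```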

relationships_where is the filtered version of the built-in relationships test. Use it when referential integrity only applies for a subset of rows:

```yaml
- dbt_utils.relationships_where:
    to: ref('mrt__core__products')
    field: product__id
    from_condition: "order__status != 'cancelled'"
```

group_by_columns

Several tests support a group_by_columns parameter that validates the test condition within each group separately. This is useful for segmented validation: checking freshness per data source, or confirming row counts match per region.

Tests that support group_by_columns: equal_rowcount, fewer_rows_than, recency, at_least_one, not_constant, sequential_values, not_null_proportion.

```yaml
- dbt_utils.recency:
    datepart: hour
    field: charge__created_at
    interval: 24
    group_by_columns:
      - payment__processor
```

This checks that each distinct payment__processor value has recent data, rather than checking the table as a whole. A single active processor masks a dormant one without this parameter.
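The per-region row-count case works the same way. A sketch, with illustrative model and column names:

```yaml
- dbt_utils.equal_rowcount:
    compare_model: ref('base__shop__orders')  # hypothetical comparison relation
    group_by_columns:
      - shop__region  # row counts must match within each region
```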