dbt-utils ships a suite of generic tests that go well beyond the four built-in tests (unique, not_null, accepted_values, relationships). They’re the first extension most dbt projects add, and for good reason — unique_combination_of_columns and expression_is_true alone cover situations that come up in nearly every project.
For where these fit in the broader testing picture, see dbt Testing Taxonomy.
Fusion / dbt Core 1.10.6+ Syntax Change
Before using any dbt-utils tests, check your dbt version. Starting with dbt Fusion (dbt 2.0) and dbt Core 1.10.6+, test arguments must be nested under an arguments: key in YAML:
```yaml
# dbt Core < 1.10.6 (original syntax)
- dbt_utils.unique_combination_of_columns:
    combination_of_columns:
      - revenue__date
      - revenue__currency
```
```yaml
# dbt Fusion / dbt Core 1.10.6+ (new syntax)
- dbt_utils.unique_combination_of_columns:
    arguments:
      combination_of_columns:
        - revenue__date
        - revenue__currency
```

If you upgrade dbt and your tests start failing with argument-related errors, this is the first thing to check. The examples below use the original syntax — add the arguments: wrapper if you're on a newer version.
The Key Tests
unique_combination_of_columns
Tests composite uniqueness — that no combination of the specified columns repeats. This is the test for any model where the grain is defined by multiple columns:
```yaml
models:
  - name: mrt__finance__daily_revenue
    data_tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - revenue__date
            - revenue__currency
```

Use this wherever unique on a single column wouldn't be meaningful. A fact table with one row per order per product per day needs this test on (order__id, product__id, order__date), not unique on any individual column.
expression_is_true
Evaluates any SQL expression per row. It fails if any row produces FALSE or NULL. This is the Swiss army knife for assertions that don’t fit neatly into other test types:
```yaml
models:
  - name: mrt__finance__invoices
    data_tests:
      - dbt_utils.expression_is_true:
          expression: "invoice__total_amount >= invoice__tax_amount"
          where: "invoice__status != 'voided'"
```

Common uses:
- Cross-column validation: end_date >= start_date, quantity > 0
- Computed column checks: discount_amount <= subtotal
- Status-dependent checks: shipped_at IS NOT NULL where status = 'shipped'
Migration note from v0.x: The old condition argument was removed in v1.0. Use the native where config instead (as shown above). Any test YAML with condition: needs updating.
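As a sketch of that migration, reusing the invoice example from above — the only change is renaming the argument and moving from a test argument to the native config:

```yaml
# dbt-utils v0.x — condition argument, removed in v1.0
- dbt_utils.expression_is_true:
    expression: "invoice__total_amount >= invoice__tax_amount"
    condition: "invoice__status != 'voided'"

# dbt-utils v1.0+ — native where config
- dbt_utils.expression_is_true:
    expression: "invoice__total_amount >= invoice__tax_amount"
    where: "invoice__status != 'voided'"
```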
accepted_range
Validates that numeric values fall within bounds:
```yaml
columns:
  - name: order__total_amount
    data_tests:
      - dbt_utils.accepted_range:
          min_value: 0
          inclusive: true
  - name: conversion__rate
    data_tests:
      - dbt_utils.accepted_range:
          min_value: 0
          max_value: 1
          inclusive: true
```

You can specify min_value, max_value, or both. The inclusive parameter controls whether the bounds are inclusive (>=/<=) or exclusive (>/<).
This catches impossible values before they corrupt metrics — negative revenue, conversion rates above 100%, session durations of negative seconds. Run it on any numeric column with meaningful bounds.
recency
Checks that data is fresh by verifying the most recent value in a date column is within a specified interval:
```yaml
models:
  - name: base__stripe__charges
    data_tests:
      - dbt_utils.recency:
          datepart: hour
          field: charge__created_at
          interval: 24
```

This test fails if the newest row in charge__created_at is more than 24 hours old. Set datepart to match your data's natural granularity.
This is a model-level freshness check, distinct from source freshness checks. Source freshness (dbt source freshness) checks whether the source was recently loaded. recency checks whether the model’s output contains recent data, which also catches transformation bugs that accidentally filter out recent rows.
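For contrast, a source freshness check lives on the source definition rather than the model. A minimal sketch, assuming a stripe source whose charges table carries a created_at load timestamp (names hypothetical):

```yaml
sources:
  - name: stripe
    tables:
      - name: charges
        loaded_at_field: created_at
        freshness:
          warn_after: {count: 12, period: hour}
          error_after: {count: 24, period: hour}
```

Running dbt source freshness evaluates this against the raw source; the recency test above evaluates the transformed model's output. Both are worth having — they fail for different reasons.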
Complete Test Reference
| Test | What it validates |
|---|---|
| unique_combination_of_columns | No repeated combination of the specified columns |
| accepted_range | Numeric values fall within min/max bounds |
| expression_is_true | An arbitrary SQL expression is true for every row |
| recency | Most recent value in a date column is within N dateparts |
| at_least_one | At least one non-null value exists in the column |
| not_constant | Values vary — not all rows have the same value |
| not_null_proportion | Proportion of non-null values meets a minimum (e.g., at_least: 0.95) |
| relationships_where | Filtered referential integrity with a WHERE clause |
| mutually_exclusive_ranges | Date/number ranges have no overlaps (both bounds must be NOT NULL) |
| sequential_values | A sequence column has no gaps |
| equal_rowcount | Two relations have the same row count |
| fewer_rows_than | Relation A has strictly fewer rows than relation B |
| equality | Full model comparison (with optional precision for numeric columns) |
| not_accepted_values | Inverse of accepted_values — values are not in the specified list |
| not_empty_string | Strings are non-empty (with trim_whitespace option) |
| cardinality_equality | Two columns have the same number of distinct values |
Tests Worth Knowing More About
not_null_proportion is useful where hard not_null is too strict. An email column with 5% nulls might be acceptable; 40% nulls might indicate a broken pipeline. Set a threshold and test against it:
```yaml
- dbt_utils.not_null_proportion:
    at_least: 0.95
```

mutually_exclusive_ranges validates that a table of intervals (date ranges, price tiers, validity periods) has no overlaps. A billing system that accidentally generates two active pricing tiers is a category of bug this test catches directly.
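A minimal sketch for a pricing-tier table — the column names (tier__valid_from, tier__valid_to, product__id) are hypothetical:

```yaml
- dbt_utils.mutually_exclusive_ranges:
    lower_bound_column: tier__valid_from
    upper_bound_column: tier__valid_to
    partition_by: product__id
    gaps: allowed
```

partition_by checks for overlaps within each product separately, and gaps controls whether empty space between consecutive ranges is allowed, required, or not_allowed.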
equality compares two models row-by-row. It’s the right tool for validating a refactored model produces identical output to its predecessor — essential during migrations. The precision argument handles floating-point comparison correctly rather than requiring exact equality on computed numerics.
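A minimal sketch of a migration check, assuming a hypothetical v2 rebuild being validated against the existing model:

```yaml
models:
  - name: mrt__finance__invoices_v2
    data_tests:
      - dbt_utils.equality:
          compare_model: ref('mrt__finance__invoices')
          precision: 2
```

With precision: 2, numeric columns are rounded to two decimal places before comparison, so immaterial floating-point drift doesn't fail the migration.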
relationships_where is the filtered version of the built-in relationships test. Use it when referential integrity only applies for a subset of rows:
```yaml
- dbt_utils.relationships_where:
    to: ref('mrt__core__products')
    field: product__id
    from_condition: "order__status != 'cancelled'"
```

group_by_columns
Several tests support a group_by_columns parameter that validates the test condition within each group separately. This is useful for segmented validation — checking freshness per data source, or confirming row counts match per region:
Tests that support group_by_columns: equal_rowcount, fewer_rows_than, recency, at_least_one, not_constant, sequential_values, not_null_proportion.
```yaml
- dbt_utils.recency:
    datepart: hour
    field: charge__created_at
    interval: 24
    group_by_columns:
      - payment__processor
```

This checks that each distinct payment__processor value has recent data, rather than checking the table as a whole. Without this parameter, a single active processor can mask a dormant one.