
dbt Testing Decision Framework

A three-question framework and decision tree for choosing the right dbt testing approach — unit tests, generic tests, singular tests, dbt-expectations, Elementary, or dbt-audit-helper.


Answer the three questions in order: what are you testing, when do you want to catch issues, and what is your cost and complexity tolerance? Each answer narrows the field to the appropriate tool for any scenario.

Question 1: What Are You Testing?

The first question is the most important. Different categories of problems call for fundamentally different tools.

| Testing for | Recommended approach | Why |
| --- | --- | --- |
| Transformation logic correctness | Unit tests | Mocked inputs isolate logic from data issues |
| Primary key integrity | Generic tests (unique + not_null) | Simple, built-in, runs on actual data |
| Referential integrity | Generic tests (relationships) | Built-in foreign key validation |
| Data freshness SLAs | dbt-utils recency OR Elementary | Explicit threshold vs. adaptive detection |
| Volume anomalies | Elementary (volume_anomalies) | No need to guess thresholds |
| Schema drift | Elementary (schema_changes) | Automatic monitoring |
| Complex business rules | Singular tests | Full SQL flexibility |
| Regex/pattern validation | Unit tests OR dbt-expectations | Logic testing vs. data validation |
| Migration accuracy | dbt-audit-helper | Row-level comparison |
| Statistical distribution | dbt-expectations | Mean, median, percentile checks |

The pattern here: if you’re testing logic (is my SQL correct?), you want unit tests with mocked data. If you’re testing data (is my production data healthy?), you want data tests against the warehouse. If you’re testing for surprises (is something weird happening that I didn’t anticipate?), you want Elementary’s anomaly detection.

Question 2: When Do You Want to Catch Issues?

Timing matters as much as tool choice. Different tools run at different points in the pipeline lifecycle, and placing them correctly determines whether you catch issues before or after damage is done.

| Timing | Test type | Rationale |
| --- | --- | --- |
| Before bad data enters the warehouse | Unit tests | Catch logic bugs in CI, before merge |
| After transformation, before downstream | Data tests | Gate failed models, prevent propagation |
| Continuously in production | Elementary | Ongoing anomaly monitoring |
| During development/refactoring | dbt-audit-helper | Validate changes against baseline |

Unit tests gate deployments. They run in CI when you push code changes. If a unit test fails, the code change doesn’t merge. They never see production data, which means they can’t protect you from data quality issues — but they catch the logic bugs that no amount of data testing would find.
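Concretely, a native dbt unit test (available from dbt 1.8) declares mocked input rows and expected output rows in YAML. This is a sketch, not a definitive setup: the model `fct_orders`, its upstream `stg_orders`, and the columns shown are hypothetical.

```yaml
# models/schema.yml -- model and column names are hypothetical
unit_tests:
  - name: test_net_revenue_applies_discount
    model: fct_orders
    given:
      - input: ref('stg_orders')
        rows:
          - {order_id: 1, amount: 100, discount_pct: 0.1}
    expect:
      rows:
        - {order_id: 1, net_revenue: 90}
```

Because the inputs are mocked, a failure in CI points at the SQL, never at the data.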

Data tests gate production data flow. They run on every dbt build against actual warehouse data. A failed data test with error severity blocks downstream models from running. They catch the problems that correct code can’t prevent: source systems sending NULLs where they never did before, duplicate records from vendor APIs, sudden shifts in data volume.
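For example, a primary key and a foreign key guarded by generic tests, with `severity: error` so a failure stops downstream builds. Model and column names are placeholders:

```yaml
# models/schema.yml -- placeholder model and column names
models:
  - name: fct_orders
    columns:
      - name: order_id
        tests:
          - unique:
              config:
                severity: error  # a failure skips downstream models in `dbt build`
          - not_null
      - name: customer_id
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id
```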

Elementary runs continuously in production, learning patterns from historical data. It catches the “unknown unknowns” — anomalies you wouldn’t think to write explicit tests for.
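As a sketch, attaching Elementary's anomaly tests takes a few lines of YAML once the `elementary` package is installed; the model name and timestamp column here are assumptions:

```yaml
# models/schema.yml -- assumes the elementary dbt package is installed
models:
  - name: fct_orders
    tests:
      - elementary.volume_anomalies:
          timestamp_column: created_at  # lets Elementary bucket rows over time
      - elementary.schema_changes
```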

dbt-audit-helper is a development tool. Use it when refactoring models or migrating from legacy systems. It proves that your changes produce identical results. It has no place in production pipelines.
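A typical development-time check compares a legacy query against the refactored model. Everything below except `audit_helper.compare_queries` itself is a hypothetical name:

```sql
-- analyses/validate_orders_migration.sql -- table and column names are hypothetical
{% set old_query %}
    select * from legacy_db.reporting.orders
{% endset %}

{% set new_query %}
    select * from {{ ref('fct_orders') }}
{% endset %}

{{ audit_helper.compare_queries(
    a_query=old_query,
    b_query=new_query,
    primary_key="order_id"
) }}
```

Compile it and run the resulting SQL during development; the output summarizes rows found in both queries, only in the old one, or only in the new one.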

Question 3: What’s Your Cost and Complexity Tolerance?

Every testing approach has a setup cost, a runtime cost, and an ongoing maintenance cost. The right answer for a solo consultant managing 30 models differs from the right answer for a data team managing 500.

| Approach | Setup effort | Runtime cost | Maintenance |
| --- | --- | --- | --- |
| Unit tests only | Medium | Low (mocked data) | Low |
| Generic tests only | Low | Medium (table scans) | Low |
| dbt-expectations | Medium | Medium | Medium (many test options) |
| Elementary | High (requires setup) | Higher (historical queries) | Medium |
| Full stack | High | Highest | Ongoing |

The honest recommendation: start with generic tests (low effort, broad coverage) and add complexity only when you have evidence it’s needed. A test suite of unique + not_null on every primary key catches more production incidents than a sophisticated Elementary deployment that nobody maintains.
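That baseline is cheap to write. Per model it is roughly this much YAML (names are placeholders):

```yaml
# models/schema.yml -- placeholder model and column names
models:
  - name: dim_customers
    columns:
      - name: customer_id  # primary key
        tests:
          - unique
          - not_null
```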

The Decision Tree

When you’re staring at a model in a code review and asking “what kind of test does this need?”, walk through this tree:

```
Is the issue about LOGIC or DATA?
|
+-- LOGIC (calculation, transformation, edge case)
|   +-- Is the logic complex enough to plausibly have bugs?
|       +-- Yes --> Unit test
|       +-- No (simple passthrough, basic aggregation) --> Skip
|
+-- DATA (integrity, quality, freshness)
    |
    +-- Do you know the exact threshold or rule?
    |   +-- Yes, simple rule (unique, not null, accepted values)
    |   |   +-- Generic test
    |   +-- Yes, complex rule (regex, cross-column, custom SQL)
    |   |   +-- Singular test or dbt-expectations
    |   +-- No, want adaptive detection
    |       +-- Elementary
    |
    +-- Are you validating a migration?
        +-- dbt-audit-helper
```

The tree starts with the fundamental split: logic vs. data. This single distinction eliminates half the options immediately. If you’re testing logic, unit tests are the only tool that makes sense — data tests can’t verify logic because they don’t control their inputs. If you’re testing data, unit tests are irrelevant because they never see real data.

Within the data branch, the split is between explicit rules (you know what “correct” looks like) and adaptive detection (you want the system to learn what “normal” looks like). Explicit rules belong in generic tests, singular tests, or dbt-expectations. Adaptive detection belongs in Elementary.
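An explicit rule that doesn't fit a generic test becomes a singular test: a SQL file in `tests/` that returns the rows violating the rule. A sketch, with hypothetical table names:

```sql
-- tests/assert_no_orders_without_line_items.sql
-- Passes when zero rows are returned; table and column names are hypothetical.
select o.order_id
from {{ ref('fct_orders') }} as o
left join {{ ref('fct_order_items') }} as i
    on o.order_id = i.order_id
where i.order_id is null
```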

Quick Reference: Test Selection Cheat Sheet

For the code reviewer who just needs a fast answer:

| Scenario | Use this | Example |
| --- | --- | --- |
| Complex calculation | Unit test | Revenue with discounts, taxes, fees |
| Primary key check | unique + not_null | Every model's ID column |
| Foreign key check | relationships | order.customer_id to customer.id |
| Enum validation | accepted_values | status in ['active', 'pending', 'closed'] |
| Regex pattern | dbt_expectations.expect_column_values_to_match_regex | Email format |
| Numeric bounds | dbt_expectations.expect_column_values_to_be_between | order_value 0-100000 |
| Row count stability | Elementary volume_anomalies | Fact tables |
| Data freshness | dbt_utils.recency or Elementary | Source-adjacent staging |
| Schema monitoring | Elementary schema_changes | All models |
| Custom business rule | Singular test | "No orders without line items" |
| Migration validation | dbt-audit-helper | Scheduled query to dbt comparison |
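The two dbt-expectations entries would be configured roughly like this; model and column names are placeholders, and the regex is a deliberately loose shape check rather than a full email validator:

```yaml
# models/schema.yml -- assumes the dbt-expectations package is installed
models:
  - name: stg_orders
    columns:
      - name: customer_email
        tests:
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: ".+@.+\\..+"
      - name: order_value
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 100000
```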

This cheat sheet is a compressed version of the framework above. For uncertain cases, the three questions — what are you testing, when do you want to catch it, and what is the cost and complexity tolerance — provide the fuller analysis.