dbt’s four native tests (unique, not_null, accepted_values, and relationships) were revolutionary when they launched. For the first time, analytics engineers could version-control their data quality assertions alongside their transformations. But in 2026, these tests are table stakes, not a testing strategy.
The 2025 dbt Labs State of Analytics Engineering survey confirms this: 56% of practitioners now cite poor data quality as their most frequent challenge, up from 41% in 2022. And 38% plan increased investment in data quality and observability tools this year. The community has realized that four generic tests aren’t enough for production pipelines.
The good news: the dbt ecosystem has matured to meet this need. You can implement anomaly detection, freshness monitoring, schema drift protection, and data contracts without leaving your dbt project. Here’s how.
The Limits of Native dbt Tests
Before adding tools, understand what you’re missing. Native dbt tests run only at build-time. They check whether data meets constraints right now, but they don’t track whether yesterday’s pass rate was normal. They can’t tell you if your table’s row count dropped by 30% compared to the weekly average.
Consider a source table that normally receives 10,000 rows per day. One morning, it gets 2,000. A not_null test on the primary key passes. A unique test passes. Every row-level constraint you wrote passes. But your downstream dashboards are now missing 80% of the data.
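You can approximate this kind of baseline check today with a hand-rolled singular test. The sketch below is illustrative only: the `base__shopify__orders` model and `loaded_at` column are hypothetical, and the 30%-below-trailing-average threshold is an arbitrary choice. In dbt's singular-test semantics, any returned rows count as failures.

```sql
-- Singular test sketch: fail when today's row count drops more than 30%
-- below the trailing 7-day average. Model and column names are hypothetical.
with daily as (
    select cast(loaded_at as date) as load_date, count(*) as row_count
    from {{ ref('base__shopify__orders') }}
    group by 1
),
baseline as (
    select avg(row_count) as avg_count
    from daily
    where load_date >= current_date - 7
      and load_date < current_date
)
select d.load_date, d.row_count, b.avg_count
from daily d
cross join baseline b
where d.load_date = current_date
  and d.row_count < 0.7 * b.avg_count
```

This works, but you have to hand-tune the threshold per table, which is exactly the maintenance burden the anomaly-detection packages below remove.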
Native tests validate rules. Production pipelines need pattern recognition.
Six Testing Patterns for Production Pipelines
Advanced testing falls into six categories, each addressing a specific failure mode.
Freshness Monitoring
Your data is worthless if it’s stale. Freshness tests assert that data arrives within expected timeframes.
dbt-expectations provides expect_row_values_to_have_recent_data and expect_grouped_row_values_to_have_recent_data for SLA assertions at table and partition levels. Elementary extends this with freshness_anomalies, which monitors time between updates, and event_freshness_anomalies, which tracks the lag between when an event occurred and when it was loaded.
```yaml
tests:
  - dbt_expectations.expect_row_values_to_have_recent_data:
      column_name: loaded_at
      datepart: hour
      interval: 1
```

Volume Anomaly Detection
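Elementary's equivalent needs little more than a timestamp column to monitor. A minimal sketch, where the `loaded_at` column name is an assumption:

```yaml
tests:
  - elementary.freshness_anomalies:
      timestamp_column: loaded_at
```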
When upstream pipelines fail silently, you often don’t get zero rows. You get fewer rows. Volume tests establish baselines and flag deviations.
Elementary’s volume_anomalies test uses statistical methods to learn what “normal” looks like for each table. It alerts when today’s count falls outside expected bounds. For simpler cases, dbt_expectations.expect_table_row_count_to_be_between provides manual threshold validation.
```yaml
tests:
  - elementary.volume_anomalies:
      where: "event_date = current_date()"
      time_bucket:
        period: day
        count: 1
```

Schema Drift Detection
A column rename in your source system breaks every model that references it. Schema tests catch these changes before they propagate.
Elementary’s schema_changes test alerts on deleted tables, added or deleted columns, and type changes. The schema_changes_from_baseline variant validates against an explicitly defined schema, useful when you need to guarantee specific column structures for downstream consumers.
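A minimal sketch of attaching `schema_changes` to a model, with a hypothetical model name:

```yaml
models:
  - name: base__shopify__orders
    tests:
      - elementary.schema_changes
```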
Distribution Checks
Some data quality problems are invisible to row-level tests. A numeric column that usually averages 50 can suddenly average 500, pass every constraint you've written, and still produce nonsense reports.
Distribution tests detect statistical outliers across time windows. The dbt_expectations.expect_column_values_to_be_within_n_moving_stdevs test catches values that deviate from historical patterns. Elementary tracks column-level anomalies across metrics including average, standard deviation, min, max, null count, null percent, zero count, and count distinct.
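The dbt-expectations moving-standard-deviations test can be configured roughly as in this sketch; the column and date names are assumptions, and `sigma_threshold` sets the deviation cutoff:

```yaml
columns:
  - name: order__amount
    tests:
      - dbt_expectations.expect_column_values_to_be_within_n_moving_stdevs:
          date_column_name: event_date
          sigma_threshold: 3
```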
```yaml
tests:
  - elementary.column_anomalies:
      column_name: order__amount
      column_anomalies:
        - average
        - max
      anomaly_sensitivity: 3
```

Semantic Validation
Business rules belong in tests, not just documentation. Semantic tests encode domain logic directly.
Regex tests like expect_column_values_to_match_regex validate format constraints: email patterns, phone numbers, product codes. Elementary introduced AI-powered validation with natural language expectations:
```yaml
tests:
  - elementary.ai_data_validation:
      prompt: "There should be no contract date in the future"
```

For complex business rules that are hard to express in SQL, natural language tests let you capture intent directly.
Unit Tests for SQL Logic
dbt 1.8 introduced native unit tests, which validate transformation logic with static mock inputs before the query ever touches your warehouse. Unlike data tests that run against real data, unit tests verify that your SQL does what you expect.
```yaml
unit_tests:
  - name: test_revenue_calculation
    model: mrt__finance__orders
    given:
      - input: ref('base__shopify__orders')
        rows:
          - {order_id: 1, order__quantity: 2, order__unit_price: 10}
    expect:
      rows:
        - {order_id: 1, order__revenue: 20}
```

Unit tests catch logic errors without warehouse compute costs and without depending on specific data existing in your environment.
The Packages That Power Advanced Testing
Three packages form the backbone of advanced dbt testing.
dbt-expectations (currently version 0.10.x, maintained by Metaplane) provides 60+ tests inspired by the Great Expectations framework. It covers schema and regex validations, time-series freshness, statistical distributions, and cross-column logic. If you need one package to extend native testing, start here.
dbt-utils (version 1.3.0) adds helper macros including fewer_rows_than, at_least_one, and cardinality_equality. These complement dbt-expectations rather than replacing it.
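For example, `fewer_rows_than` asserts a relative row-count relationship between two models, which is handy for checking that an aggregated mart never outgrows its base table. A sketch with hypothetical model names:

```yaml
models:
  - name: mrt__finance__orders
    tests:
      - dbt_utils.fewer_rows_than:
          compare_model: ref('base__shopify__orders')
```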
Elementary (version 0.22.0) delivers dbt-native observability with built-in anomaly detection. It’s trusted by over 5,000 data professionals and provides both the testing framework and the dashboard to visualize results.
The choice between them isn’t either-or. Most mature projects use dbt-expectations for explicit rule-based tests and Elementary for anomaly detection and observability.
Data Contracts: When to Enforce Schema
dbt introduced native data contracts in version 1.5. When enabled, contracts enforce column structure during every dbt build, failing fast on violations before data is written.
```yaml
models:
  - name: mrt__core__customers
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: string
        constraints:
          - type: not_null
          - type: unique
      - name: customer__email
        data_type: string
      - name: customer__created_at
        data_type: timestamp
```

Three contract types have emerged in practice:
- Schema contracts enforce column names, data types, order, and precision
- Semantic contracts validate metric deviations, business logic, and referential integrity
- SLA contracts assert freshness expectations and update schedules
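An SLA contract can be expressed with dbt's built-in source freshness configuration. A sketch assuming a hypothetical `shopify` source with a `loaded_at` column:

```yaml
sources:
  - name: shopify
    loaded_at_field: loaded_at
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
```

`dbt source freshness` then warns or errors when the newest `loaded_at` value falls outside these windows.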
The practical guidance from teams using contracts: they're "designed to slow people down." The friction is intentional, so apply it where the cost of breakage is high. Enforce contracts on high-criticality public models (the marts that feed executive dashboards, the tables that drive regulatory reporting) rather than everywhere.
ML-Based Anomaly Detection
Rule-based tests require you to anticipate failure modes. ML-based detection learns what “normal” looks like and alerts on deviations you didn’t predict.
Elementary uses Z-score based detection with configurable sensitivity. A sensitivity of 3 means alerting when values fall more than 3 standard deviations from the mean. This works well for most cases but can’t adapt to complex seasonal patterns.
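Conceptually, a Z-score volume check boils down to a query like this sketch (table and column names are hypothetical, and Elementary's actual implementation differs in detail):

```sql
-- Flag days whose row count is more than 3 standard deviations away from
-- the mean of the preceding 14-day training window.
with daily as (
    select event_date, count(*) as row_count
    from {{ ref('base__shopify__orders') }}
    group by event_date
),
scored as (
    select
        event_date,
        row_count,
        avg(row_count) over (order by event_date
            rows between 14 preceding and 1 preceding) as mean_count,
        stddev(row_count) over (order by event_date
            rows between 14 preceding and 1 preceding) as stddev_count
    from daily
)
select *
from scored
where abs(row_count - mean_count) / nullif(stddev_count, 0) > 3
```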
```yaml
tests:
  - elementary.column_anomalies:
      column_anomalies:
        - average
      anomaly_sensitivity: 2
      training_period:
        period: day
        count: 14
```

More sophisticated tools like Monte Carlo and Anomalo use advanced ML that learns from historical patterns, handling seasonality and trend changes automatically. Monte Carlo's customers report an 80% reduction in data downtime. Both come with higher cost and integration complexity.
For most dbt-centric teams, Elementary’s approach hits the sweet spot: meaningful anomaly detection without leaving the dbt ecosystem or adding significant cost.
Integrating Quality Checks into CI/CD
Testing in production catches problems, but testing in CI prevents them from reaching production in the first place.
dbt Cloud’s Slim CI pattern runs only modified models and their downstream dependencies:
```shell
dbt build --select state:modified+ --defer --state ./
```

This uses manifest comparison to avoid rebuilding unchanged models, dramatically reducing compute costs while still catching regressions.
For GitHub Actions or similar CI systems:
```yaml
name: dbt CI

on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint SQL
        run: sqlfluff lint models/
      - name: Build and test modified models
        run: dbt build --select state:modified+ --defer --state ./
      - name: Run data tests
        run: dbt test
```

Tools that enhance CI workflows:
- Datafold compares data between your PR branch and production, catching value-level changes that pass schema tests
- Recce analyzes column-level lineage to categorize changes as breaking, partial-breaking, or non-breaking
- SQLFluff enforces SQL style consistency
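Getting SQLFluff to lint dbt models correctly mostly comes down to pointing it at the dbt templater. A minimal `.sqlfluff` sketch, where the dialect is an assumption about your warehouse:

```ini
[sqlfluff]
dialect = snowflake
templater = dbt

[sqlfluff:templater:dbt]
project_dir = ./
```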
A Forrester Total Economic Impact study found a 30% boost in developer productivity and 60% savings in data rework time for teams adopting dbt Cloud with CI workflows.
What This Costs in Practice
Investment in data quality isn’t abstract. The 2025 dbt Labs survey found that analytics engineers still spend 57% of their time maintaining and organizing datasets rather than building new capabilities.
Industry-wide, the numbers are stark: organizations lose an estimated $12.9 million annually through poor data quality, according to Gartner research. A Monte Carlo survey found that data engineers spend 40% of their workdays on data quality issues.
High-profile failures illustrate the stakes:
- Unity Technologies lost $110 million in Q1 2022 when corrupted data broke their ML models
- JPMorgan Chase was fined approximately $350 million in 2024 for incomplete trading data in surveillance systems
- Public Health England lost 16,000 positive COVID tests in 2020 when Excel’s row limit silently truncated records
A Monte Carlo survey found data downtime nearly doubled year-over-year, with time-to-resolution increasing by 166%. Prevention costs a fraction of remediation.
Where to Start
You don’t need to implement everything at once. Pick one pattern based on your current pain:
If dashboards go stale without warning: Start with freshness monitoring. Add Elementary’s freshness_anomalies to your most critical sources.
If upstream changes break your models: Implement schema drift detection. schema_changes tests catch problems before they propagate.
If you’ve been bitten by volume drops: Add volume_anomalies to tables that receive regular data loads.
If you need guarantees for downstream consumers: Enable data contracts on your public-facing marts.
The dbt ecosystem has evolved past basic testing. The tools exist, they’re mature, and they integrate directly into your existing workflow. Pick the one that solves your most pressing problem today.