A reference for the highest-value dbt-expectations tests by category — tests that fill gaps native dbt tests cannot cover. All examples use BigQuery-compatible SQL. dbt-expectations ships 50+ tests total; this note covers the most broadly applicable.
Table-Level Tests
expect_row_values_to_have_recent_data
Native dbt only offers freshness checks on sources. This test works on any model, catching stale data before dashboards show outdated numbers.
models: - name: mrt__sales__orders columns: - name: order_timestamp tests: - dbt_expectations.expect_row_values_to_have_recent_data: datepart: hour interval: 24This fails if no rows have an order_timestamp within the last 24 hours. For GA4 data, which typically has a 24-48 hour delay, set interval: 48. For daily batch models, use datepart: day with interval: 2 to allow for weekends or holidays.
expect_table_row_count_to_equal_other_table
Transformations should not silently drop rows. This test catches it when they do:
models: - name: mrt__sales__orders tests: - dbt_expectations.expect_table_row_count_to_equal_other_table: compare_model: ref('base__shopify__orders')If your base model has 50,000 rows and your mart has 49,000, something went wrong in between. This is particularly useful during refactoring — when you rewrite a model, the row count should match the original unless you have a specific reason for it to differ.
expect_table_row_count_to_be_between
Detects unexpected volume changes. If your daily batch normally contains 10,000-100,000 rows and suddenly has 500, you want to know before anyone opens a dashboard:
models: - name: base__ga4__events tests: - dbt_expectations.expect_table_row_count_to_be_between: min_value: 10000 max_value: 100000Query your data’s historical volumes to set reasonable bounds. If you’d rather not maintain static thresholds, Elementary’s volume_anomalies test learns the pattern automatically. Use expect_table_row_count_to_be_between when you know the bounds; use Elementary when you want adaptive detection.
Pattern Validation
expect_column_values_to_match_regex
Native dbt has no format validation tests. This test covers that category:
columns: - name: customer_email tests: - dbt_expectations.expect_column_values_to_match_regex: regex: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'Other common patterns:
- GA4
user_pseudo_id:'^[0-9]+\\.[0-9]+$' - GA4
event_name:'^[a-z_]+$' - Product SKU like
PRD-1234:'^PRD-[0-9]{4}$' - ISO country code:
'^[A-Z]{2}$'
Always combine regex tests with row_condition on nullable columns. Without it, NULL values fail the regex and produce false positives.
expect_column_values_to_match_like_pattern
When regex is overkill, use SQL LIKE patterns instead. Simpler to read, sufficient for prefix/suffix checks:
columns: - name: product_sku tests: - dbt_expectations.expect_column_values_to_match_like_pattern: like_pattern: 'PRD-%'Value Range Validation
expect_column_values_to_be_between
Validates that values fall within defined bounds. Applies to numeric and date columns:
columns: - name: order_value tests: - dbt_expectations.expect_column_values_to_be_between: min_value: 0 max_value: 1000000
- name: conversion_rate tests: - dbt_expectations.expect_column_values_to_be_between: min_value: 0 max_value: 1
- name: event_date tests: - dbt_expectations.expect_column_values_to_be_between: min_value: "'2020-01-01'" max_value: "current_date()"Note the quoting for string and date values: the outer quotes are YAML, the inner quotes are SQL. "'2020-01-01'" becomes '2020-01-01' in the compiled query.
expect_column_mean_to_be_between
Catches distribution shifts that row-level range checks miss. Individual values may all be within bounds while the overall distribution is anomalous:
columns: - name: order_value tests: - dbt_expectations.expect_column_mean_to_be_between: min_value: 50 max_value: 200Query the data first to establish reasonable bounds (SELECT AVG(order_value) FROM your_model). Bounds should be wide enough that normal fluctuations pass, narrow enough that real anomalies fail.
For columns where the mean is meaningless (bimodal distributions, heavy outliers), pair this with expect_column_median_to_be_between or use Elementary’s column_anomalies instead.
Multi-Column Validation
expect_compound_columns_to_be_unique
Native unique only works on a single column. For tables with a composite primary key:
models: - name: mrt__sales__order_lines tests: - dbt_expectations.expect_compound_columns_to_be_unique: column_list: ["order_id", "line_item_id"]Apply this to every table with a composite primary key. Join explosions from duplicate composite keys are among the hardest bugs to diagnose because they silently inflate metrics rather than producing errors.
expect_column_pair_values_A_to_be_greater_than_B
Validates business logic that spans two columns:
models: - name: mrt__finance__subscriptions tests: - dbt_expectations.expect_column_pair_values_A_to_be_greater_than_B: column_A: end_date column_B: start_date or_equal: true row_condition: "end_date is not null"Common use cases:
shipped_date > order_datetotal_amount >= subtotalupdated_at >= created_atend_date >= start_date
The or_equal: true parameter matters — without it, records where both dates are equal (same-day subscriptions, instant shipments) would fail.
Completeness Tests
expect_row_values_to_have_data_for_every_n_datepart
Detects gaps in time series data. If you’re missing an entire day of GA4 events, this test fails:
columns: - name: event_date tests: - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart: date_col: event_date date_part: day test_start_date: "'2024-01-01'" test_end_date: "current_date() - 1"Always specify date bounds. Without test_start_date and test_end_date, this test scans your entire table history. On a large GA4 dataset, that means scanning millions of rows to check for gaps going back to the beginning of time. Bound it to a recent window (90 days, a year) to keep costs manageable.
For BigQuery specifically, use partition-aware date expressions:
test_start_date: "date_sub(current_date(), interval 90 day)"test_end_date: "date_sub(current_date(), interval 2 day)"The interval 2 day end bound accounts for GA4’s typical processing delay — you don’t want the test failing because today’s data hasn’t arrived yet.
Quick Reference
| Test | Use case |
|---|---|
expect_row_values_to_have_recent_data | Freshness checks on any model |
expect_table_row_count_to_equal_other_table | Verify transformations don’t drop rows |
expect_table_row_count_to_be_between | Detect volume anomalies |
expect_column_values_to_match_regex | Format validation (emails, IDs, SKUs) |
expect_column_values_to_match_like_pattern | Simple prefix/suffix format checks |
expect_column_values_to_be_between | Value range checks (numeric and date) |
expect_column_mean_to_be_between | Distribution shift detection |
expect_compound_columns_to_be_unique | Composite primary key validation |
expect_column_pair_values_A_to_be_greater_than_B | Cross-column relationship validation |
expect_row_values_to_have_data_for_every_n_datepart | Time series completeness |
For test configuration patterns (severity levels, performance optimization, prod-only execution), see dbt Test Severity and Performance Tuning. For the row_condition parameter that works across nearly all of these tests, see dbt-expectations row_condition Pattern.