dbt-expectations Test Reference

A reference for the highest-value dbt-expectations tests by category — tests that fill gaps native dbt tests cannot cover. All examples use BigQuery-compatible SQL. dbt-expectations ships 50+ tests total; this note covers the most broadly applicable.

Table-Level Tests

expect_row_values_to_have_recent_data

Native dbt only offers freshness checks on sources. This test works on any model, catching stale data before dashboards show outdated numbers.

models:
  - name: mrt__sales__orders
    columns:
      - name: order_timestamp
        tests:
          - dbt_expectations.expect_row_values_to_have_recent_data:
              datepart: hour
              interval: 24

This fails if no rows have an order_timestamp within the last 24 hours. For GA4 data, which typically has a 24-48 hour delay, set interval: 48. For daily batch models, use datepart: day with interval: 2 to allow for weekends or holidays.

expect_table_row_count_to_equal_other_table

Transformations should not silently drop rows. This test catches it when they do:

models:
  - name: mrt__sales__orders
    tests:
      - dbt_expectations.expect_table_row_count_to_equal_other_table:
          compare_model: ref('base__shopify__orders')

If your base model has 50,000 rows and your mart has 49,000, something went wrong in between. This is particularly useful during refactoring — when you rewrite a model, the row count should match the original unless you have a specific reason for it to differ.

expect_table_row_count_to_be_between

Detects unexpected volume changes. If your daily batch normally contains 10,000-100,000 rows and suddenly has 500, you want to know before anyone opens a dashboard:

models:
  - name: base__ga4__events
    tests:
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 10000
          max_value: 100000

Query your data’s historical volumes to set reasonable bounds. If you’d rather not maintain static thresholds, Elementary’s volume_anomalies test learns the pattern automatically. Use expect_table_row_count_to_be_between when you know the bounds; use Elementary when you want adaptive detection.

Pattern Validation

expect_column_values_to_match_regex

Native dbt has no format validation tests. This test covers that category:

columns:
  - name: customer_email
    tests:
      - dbt_expectations.expect_column_values_to_match_regex:
          regex: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'

Other common patterns:

GA4 user_pseudo_id: '^[0-9]+\\.[0-9]+$'
GA4 event_name: '^[a-z_]+$'
Product SKU like PRD-1234: '^PRD-[0-9]{4}$'
ISO country code: '^[A-Z]{2}$'

Always combine regex tests with row_condition on nullable columns. Without it, NULL values fail the regex and produce false positives.

expect_column_values_to_match_like_pattern

When regex is overkill, use SQL LIKE patterns instead. Simpler to read, sufficient for prefix/suffix checks:

columns:
  - name: product_sku
    tests:
      - dbt_expectations.expect_column_values_to_match_like_pattern:
          like_pattern: 'PRD-%'

Value Range Validation

expect_column_values_to_be_between

Validates that values fall within defined bounds. Applies to numeric and date columns:

columns:
  - name: order_value
    tests:
      - dbt_expectations.expect_column_values_to_be_between:
          min_value: 0
          max_value: 1000000

  - name: conversion_rate
    tests:
      - dbt_expectations.expect_column_values_to_be_between:
          min_value: 0
          max_value: 1

  - name: event_date
    tests:
      - dbt_expectations.expect_column_values_to_be_between:
          min_value: "'2020-01-01'"
          max_value: "current_date()"

Note the quoting for string and date values: the outer quotes are YAML, the inner quotes are SQL. "'2020-01-01'" becomes '2020-01-01' in the compiled query.

expect_column_mean_to_be_between

Catches distribution shifts that row-level range checks miss. Individual values may all be within bounds while the overall distribution is anomalous:

columns:
  - name: order_value
    tests:
      - dbt_expectations.expect_column_mean_to_be_between:
          min_value: 50
          max_value: 200

Query the data first to establish reasonable bounds (SELECT AVG(order_value) FROM your_model). Bounds should be wide enough that normal fluctuations pass, narrow enough that real anomalies fail.

For columns where the mean is meaningless (bimodal distributions, heavy outliers), pair this with expect_column_median_to_be_between or use Elementary’s column_anomalies instead.

Multi-Column Validation

expect_compound_columns_to_be_unique

Native unique only works on a single column. For tables with a composite primary key:

models:
  - name: mrt__sales__order_lines
    tests:
      - dbt_expectations.expect_compound_columns_to_be_unique:
          column_list: ["order_id", "line_item_id"]

Apply this to every table with a composite primary key. Join explosions from duplicate composite keys are among the hardest bugs to diagnose because they silently inflate metrics rather than producing errors.

expect_column_pair_values_A_to_be_greater_than_B

Validates business logic that spans two columns:

models:
  - name: mrt__finance__subscriptions
    tests:
      - dbt_expectations.expect_column_pair_values_A_to_be_greater_than_B:
          column_A: end_date
          column_B: start_date
          or_equal: true
          row_condition: "end_date is not null"

Common use cases:

shipped_date > order_date
total_amount >= subtotal
updated_at >= created_at
end_date >= start_date

The or_equal: true parameter matters — without it, records where both dates are equal (same-day subscriptions, instant shipments) would fail.

Completeness Tests

expect_row_values_to_have_data_for_every_n_datepart

Detects gaps in time series data. If you’re missing an entire day of GA4 events, this test fails:

columns:
  - name: event_date
    tests:
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: event_date
          date_part: day
          test_start_date: "'2024-01-01'"
          test_end_date: "current_date() - 1"

Always specify date bounds. Without test_start_date and test_end_date, this test scans your entire table history. On a large GA4 dataset, that means scanning millions of rows to check for gaps going back to the beginning of time. Bound it to a recent window (90 days, a year) to keep costs manageable.

For BigQuery specifically, use partition-aware date expressions:

test_start_date: "date_sub(current_date(), interval 90 day)"
test_end_date: "date_sub(current_date(), interval 2 day)"

The interval 2 day end bound accounts for GA4’s typical processing delay — you don’t want the test failing because today’s data hasn’t arrived yet.

Quick Reference

Test	Use case
`expect_row_values_to_have_recent_data`	Freshness checks on any model
`expect_table_row_count_to_equal_other_table`	Verify transformations don’t drop rows
`expect_table_row_count_to_be_between`	Detect volume anomalies
`expect_column_values_to_match_regex`	Format validation (emails, IDs, SKUs)
`expect_column_values_to_match_like_pattern`	Simple prefix/suffix format checks
`expect_column_values_to_be_between`	Value range checks (numeric and date)
`expect_column_mean_to_be_between`	Distribution shift detection
`expect_compound_columns_to_be_unique`	Composite primary key validation
`expect_column_pair_values_A_to_be_greater_than_B`	Cross-column relationship validation
`expect_row_values_to_have_data_for_every_n_datepart`	Time series completeness

For test configuration patterns (severity levels, performance optimization, prod-only execution), see dbt Test Severity and Performance Tuning. For the row_condition parameter that works across nearly all of these tests, see dbt-expectations row_condition Pattern.