Incremental models have two code paths — one for full refresh, another for normal incremental loads — and teams typically test only one. The full refresh path is tested because that’s what dbt run does during development.
A common failure mode: the model works in dev and on full refresh, but a WHERE clause bug in the incremental path produces duplicates that accumulate across months of daily runs before anyone notices.
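Unit tests aside, a standard schema test on the unique key acts as a runtime safety net for exactly this failure mode. A minimal sketch, reusing the model and column names that appear later in this post (the YAML file path is hypothetical):

```yaml
# models/intermediate/_int__models.yml (hypothetical path)
models:
  - name: int__events_processed
    columns:
      - name: event_id
        tests:
          - unique    # fails as soon as the incremental path inserts duplicates
          - not_null
```

This doesn't replace unit tests; it just turns "duplicates accumulate for months" into "the next scheduled run fails loudly."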
Unit tests catch these bugs before deployment by simulating both execution paths with controlled test data.
The Dual-Mode Testing Pattern
Every incremental model needs at least two unit tests: one for full refresh mode, one for incremental mode. This isn’t optional. If you’re only testing one mode, you’re leaving half your logic untested.
```sql
-- models/intermediate/int__events_processed.sql

{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    user_id,
    event_type,
    event_timestamp,
    processed_at
from {{ ref('base__ga4__events') }}

{% if is_incremental() %}
where event_timestamp > (select max(event_timestamp) from {{ this }})
{% endif %}
```

The {% if is_incremental() %} block is where bugs hide. Common mistakes include off-by-one errors (>= vs >), timezone mismatches, and incorrect column references.
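To see how small the off-by-one bug is, compare the correct filter with a hypothetical buggy variant (not from the model above):

```sql
-- Correct: strictly newer events only
where event_timestamp > (select max(event_timestamp) from {{ this }})

-- Bug: >= re-selects every event sharing the current max timestamp,
-- so an append-style run re-inserts those boundary rows on every execution
where event_timestamp >= (select max(event_timestamp) from {{ this }})
```

Both variants pass a full-refresh test; only an incremental-mode unit test with a mocked this table can tell them apart.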
Testing Full Refresh
```yaml
unit_tests:
  - name: test_int_events_processed_full_refresh
    model: int__events_processed
    description: "On full refresh, all source rows should be processed"
    overrides:
      macros:
        is_incremental: false
    given:
      - input: ref('base__ga4__events')
        rows:
          - {event_id: 1, user_id: 100, event_type: "click", event_timestamp: "2024-06-01 10:00:00"}
          - {event_id: 2, user_id: 100, event_type: "purchase", event_timestamp: "2024-06-02 11:00:00"}
          - {event_id: 3, user_id: 101, event_type: "click", event_timestamp: "2024-06-03 12:00:00"}
    expect:
      rows:
        - {event_id: 1, user_id: 100, event_type: "click"}
        - {event_id: 2, user_id: 100, event_type: "purchase"}
        - {event_id: 3, user_id: 101, event_type: "click"}
```

Setting is_incremental: false in overrides simulates dbt run --full-refresh. All three events appear in the output; no filtering happens.
Testing Incremental Mode
```yaml
unit_tests:
  - name: test_int_events_processed_incremental
    model: int__events_processed
    description: "On incremental run, only new events should be inserted"
    overrides:
      macros:
        is_incremental: true
    given:
      - input: ref('base__ga4__events')
        rows:
          - {event_id: 1, user_id: 100, event_type: "click", event_timestamp: "2024-06-01 10:00:00"}
          - {event_id: 2, user_id: 100, event_type: "purchase", event_timestamp: "2024-06-02 11:00:00"}
          - {event_id: 3, user_id: 101, event_type: "click", event_timestamp: "2024-06-03 12:00:00"}
      - input: this
        rows:
          - {event_id: 1, user_id: 100, event_type: "click", event_timestamp: "2024-06-01 10:00:00"}
    expect:
      rows:
        - {event_id: 2, user_id: 100, event_type: "purchase"}
        - {event_id: 3, user_id: 101, event_type: "click"}
```

Setting is_incremental: true forces incremental mode. The input: this entry mocks the current state of the target table: it contains one event with timestamp 2024-06-01 10:00:00. The model's WHERE clause filters to events after that timestamp, so only events 2 and 3 appear.
The Critical Insight: Expect Shows Inserts, Not Final State
This trips up many developers. The expect block represents what gets inserted or merged, not the final table state. In the incremental test above, event_id 1 already exists in this, so only events 2 and 3 appear in the expected output. You’re testing the transformation logic, not the merge operation itself.
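As an illustration of the mistake, here is the expectation intuition suggests for that incremental test, which would fail:

```yaml
# WRONG for the incremental test above: this describes the final table state,
# not the rows the model selects on this run
expect:
  rows:
    - {event_id: 1, user_id: 100, event_type: "click"}     # already in `this`; never re-selected
    - {event_id: 2, user_id: 100, event_type: "purchase"}
    - {event_id: 3, user_id: 101, event_type: "click"}
```

The incremental filter excludes event_id 1, so a three-row expectation can never match.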
Testing Merge Logic
The merge_update_columns config adds complexity. When a row with an existing unique_key arrives, BigQuery updates only the specified columns, leaving others unchanged.
```sql
{{ config(
    materialized='incremental',
    unique_key='user_id',
    merge_update_columns=['email', 'updated_at']
) }}
```

```yaml
unit_tests:
  - name: test_int_users_current_merge_update
    model: int__users_current
    description: "Existing users should have email updated, not created_at"
    overrides:
      macros:
        is_incremental: true
    given:
      - input: ref('base__crm__users')
        rows:
          - {user_id: 1, email: "new@example.com", created_at: "2024-06-15", updated_at: "2024-06-15"}
      - input: this
        rows:
          - {user_id: 1, email: "old@example.com", created_at: "2024-01-01", updated_at: "2024-01-01"}
    expect:
      rows:
        - {user_id: 1, email: "new@example.com", created_at: "2024-06-15", updated_at: "2024-06-15"}
```

Important caveat: this test validates that your transformation logic produces the correct output row. It doesn't test the merge behavior itself; that's BigQuery's responsibility. The expected output shows created_at: "2024-06-15" because that's what the source data contains. In practice, the actual merge would preserve the original created_at: "2024-01-01" from this. For more complex merge scenarios, consider dbt-audit-helper to compare actual results after a test run.
Testing Late-Arriving Data
Real-world pipelines rarely receive events in perfect chronological order. A common pattern uses a lookback window: instead of only processing events newer than max(event_timestamp), you also include events that arrived recently (based on a _loaded_at column) even if their event timestamp is older.
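The model shown earlier has no lookback. One way to sketch the clause that the test below assumes, using BigQuery SQL and a lookback_hours var (both are assumptions for illustration, not code from this post's model):

```sql
{% if is_incremental() %}
where event_timestamp > (select max(event_timestamp) from {{ this }})
   -- also pick up rows loaded recently, regardless of their event time
   or _loaded_at > timestamp_sub(
        current_timestamp(), interval {{ var('lookback_hours', 24) }} hour)
{% endif %}
```

With a unique_key configured, re-selecting a few already-processed rows inside the window is harmless: the merge deduplicates them.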
```yaml
unit_tests:
  - name: test_int_events_processed_late_arriving
    model: int__events_processed
    description: "Late-arriving events within lookback window should be captured"
    overrides:
      macros:
        is_incremental: true
      vars:
        lookback_hours: 24
    given:
      - input: ref('base__ga4__events')
        rows:
          - {event_id: 4, event_timestamp: "2024-06-01 08:00:00", _loaded_at: "2024-06-02 10:00:00"}
          - {event_id: 5, event_timestamp: "2024-06-02 09:00:00", _loaded_at: "2024-06-02 09:05:00"}
      - input: this
        rows:
          - {event_id: 1, event_timestamp: "2024-06-01 10:00:00"}
    expect:
      rows:
        - {event_id: 4, event_timestamp: "2024-06-01 08:00:00"}
        - {event_id: 5, event_timestamp: "2024-06-02 09:00:00"}
```

Event 4 has an old event timestamp (June 1, 08:00) but a recent load time (June 2, 10:00): it arrived late. A naive WHERE event_timestamp > max(event_timestamp) would miss it. The lookback window ensures it gets captured. See Testing Late-Arriving Data Handling in dbt for deeper coverage of this testing pattern.
Prerequisite: Schema Must Exist
Before running incremental unit tests, build your incremental models with:
```shell
dbt run --select "config.materialized:incremental" --empty
```

Unit tests need to introspect the table structure, and they fail with "Not able to get columns for unit test" if the target table doesn't exist. The --empty flag builds each table with the right schema without processing any data.
Notes on the Dual-Mode Pattern
Incremental merge bugs accumulate silently across runs. The dual-mode pattern — testing both is_incremental: false and is_incremental: true with explicit this mocking — catches these bugs before deployment, when the bad state is still recoverable.