Idempotency and lookback windows only work if you test them. Without explicit tests, a misconfigured lookback window silently misses late records for months before anyone notices. By then, the drift is hard to quantify and harder to fix.
There are two testing layers: unit tests that verify your model logic handles late arrivals correctly, and production comparison queries that detect drift between incremental and full-refresh results.
Unit Testing Late Arrivals with dbt 1.8+
dbt’s unit testing framework (added in dbt 1.8) lets you simulate the state of your target table and verify that your model handles late-arriving records correctly. The key is overriding is_incremental() to force the model into incremental mode during the test.
```yaml
unit_tests:
  - name: test_late_arrival_captured_within_window
    model: fct__events
    overrides:
      macros:
        is_incremental: true
    given:
      - input: this
        rows:
          - {event_id: 1, event_data: "original", updated_at: "2024-01-15"}
          - {event_id: 3, event_data: "existing_record", updated_at: "2024-01-14"}
      - input: ref('base__events')
        rows:
          - {event_id: 1, event_data: "updated", updated_at: "2024-01-17"}
          - {event_id: 2, event_data: "late_arrival", updated_at: "2024-01-13"}
          - {event_id: 3, event_data: "existing_record", updated_at: "2024-01-14"}
    expect:
      rows:
        - {event_id: 1, event_data: "updated"}
        - {event_id: 2, event_data: "late_arrival"}
        - {event_id: 3, event_data: "existing_record"}
```

What this test verifies: event_id: 1 gets updated (its newer version replaces the existing record); event_id: 2 is a late arrival dated January 13, before the current max of January 17 in {{ this }}, and it gets captured because it falls within the lookback window; and event_id: 3 passes through unchanged.
The is_incremental: true override is essential. Without it, the model runs in full-refresh mode and the {{ this }} reference has no meaning, making the test useless for verifying incremental behavior.
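For context, the tests in this section assume a model shaped roughly like the following. This is a hedged sketch, not a definitive implementation: the column names, the 3-day interval, and the Snowflake-style dateadd/qualify syntax are all assumptions.

```sql
-- fct__events.sql (hypothetical sketch of the model under test)
{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    event_data,
    updated_at
from {{ ref('base__events') }}

{% if is_incremental() %}
-- Lookback window: reprocess the last 3 days relative to the current
-- max in the target, so late-arriving records are still picked up.
where updated_at >= (
    select dateadd('day', -3, max(updated_at)) from {{ this }}
)
{% endif %}

-- Keep only the latest version of each event so reprocessed rows
-- update rather than duplicate (works together with unique_key above).
qualify row_number() over (partition by event_id order by updated_at desc) = 1
```

The `{% if is_incremental() %}` branch is exactly what the `is_incremental: true` override forces the unit test to exercise.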
Testing Boundary Conditions
The lookback window boundary is where most bugs live. A record that arrives exactly at the boundary should be captured; one that arrives just outside should not. Test both:
```yaml
unit_tests:
  - name: test_record_at_lookback_boundary_captured
    model: fct__events
    overrides:
      macros:
        is_incremental: true
    given:
      - input: this
        rows:
          # Current max is Jan 17; a 3-day lookback makes Jan 14 the cutoff
          - {event_id: 10, event_timestamp: "2024-01-17 12:00:00", event_data: "latest"}
      - input: ref('base__events')
        rows:
          # One second inside the window: should be captured
          - {event_id: 11, event_timestamp: "2024-01-14 00:00:01", event_data: "boundary_record"}
          # One second outside the window: should be excluded
          - {event_id: 12, event_timestamp: "2024-01-13 23:59:59", event_data: "outside_window"}
    expect:
      rows:
        - {event_id: 10, event_timestamp: "2024-01-17 12:00:00", event_data: "latest"}
        - {event_id: 11, event_timestamp: "2024-01-14 00:00:01", event_data: "boundary_record"}
        # event_id: 12 must NOT appear
```

A record that lands exactly on the cutoff (2024-01-14 00:00:00) is where the filter operator matters: >= includes it, > excludes it. Pinning fixtures one second to either side of the cutoff makes the intended behavior explicit and prevents accidental changes to the filter from going unnoticed.
Testing Deduplication Within the Window
When a lookback window reprocesses records that are already in the target table, the model must update rather than duplicate them. Test this explicitly:
```yaml
unit_tests:
  - name: test_reprocessed_records_update_not_duplicate
    model: fct__events
    overrides:
      macros:
        is_incremental: true
    given:
      - input: this
        rows:
          - {event_id: 1, event_data: "stale_value", processed_at: "2024-01-16 10:00:00"}
      - input: ref('base__events')
        rows:
          # Same event_id, newer value: should update, not duplicate
          - {event_id: 1, event_data: "fresh_value", processed_at: "2024-01-17 08:00:00"}
    expect:
      rows:
        # Exactly one row, no duplicates
        - {event_id: 1, event_data: "fresh_value"}
```

This verifies that your unique_key and deduplication logic (QUALIFY/ROW_NUMBER) work together correctly when the lookback window reprocesses already-processed records.
Production Validation with audit_helper
Unit tests verify that your model logic is correct. Production validation detects whether your running model is actually drifting from source truth — which can happen when lookback windows are sized incorrectly or when data arrives outside the window.
The standard pattern: compare a recent slice of your incremental model against a full refresh of the same time period.
```sql
{% set incremental_query %}
    select * from {{ ref('fct__events') }}
    where event_date >= current_date - 7
{% endset %}

{% set full_refresh_query %}
    select * from {{ ref('base__events') }}
    where event_date >= current_date - 7
    qualify row_number() over (partition by event_id order by updated_at desc) = 1
{% endset %}

{{ audit_helper.compare_queries(
    a_query=incremental_query,
    b_query=full_refresh_query,
    primary_key='event_id'
) }}
```

The queries are built in {% set %} blocks because Jinja expressions don't nest: a {{ ref() }} inside a string argument to another {{ }} call won't render. This comparison returns rows present in one query but not the other, plus rows where the key exists in both but values differ. Non-empty results indicate drift.
Running this comparison weekly (or after any known pipeline issue) gives you early warning before drift compounds. The 7-day comparison window is intentional: it is wider than the typical 3-day lookback window, so it can catch late-arriving records that just missed the window.
Interpreting Comparison Results
audit_helper's compare_queries returns a result set flagging whether each row appears only in query A, only in query B, or in both with differing values. The categories have different meanings:
- in_a_only (incremental has it, full refresh doesn't): Usually harmless; records in your incremental model that fall outside the comparison window. Can also indicate extra records from a deduplication bug.
- in_b_only (full refresh has it, incremental doesn't): These are the dangerous ones. Records your incremental model should have but doesn't, i.e. late-arriving data that fell outside your lookback window.
- in_both_different: Records exist in both but with different values. Suggests either a missed update or deduplication logic that picked a different version.
A high count of in_b_only records within your comparison window (but outside your lookback window) is a direct signal that your window is too narrow.
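To get these counts directly rather than scanning row-level output, compare_queries accepts a summarize argument. The sketch below is hedged: the summarize flag is part of audit_helper's documented interface, but the exact output columns can vary by version, and the query names are carried over from the earlier example.

```sql
{% set incremental_query %}
    select * from {{ ref('fct__events') }}
    where event_date >= current_date - 7
{% endset %}

{% set full_refresh_query %}
    select * from {{ ref('base__events') }}
    where event_date >= current_date - 7
    qualify row_number() over (partition by event_id order by updated_at desc) = 1
{% endset %}

-- summarize=true collapses the diff into per-category counts
-- instead of returning row-level detail
{{ audit_helper.compare_queries(
    a_query=incremental_query,
    b_query=full_refresh_query,
    primary_key='event_id',
    summarize=true
) }}
```

A rising in_b_only count across weekly runs is the trend to watch; it quantifies exactly how many records are slipping past the lookback window.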
When Unit Tests Aren’t Enough
Unit tests verify logic in isolation with controlled fixture data. They can’t detect:
- Window sizing problems: If your lookback window is 3 days but 10% of records arrive within 5 days, unit tests with synthetic fixtures won’t catch the real-world miss rate.
- Source data distribution changes: The latency profile of your source system can shift over time. What worked for 3 days last year might need 7 days this year.
- Interaction effects at scale: A deduplication that works correctly on 10 test rows can have edge cases that only appear with millions of rows and a realistic distribution of duplicates.
The audit_helper comparison is what catches these. Combine unit tests for logic correctness with periodic production comparison for drift detection. Neither alone is sufficient — one tests the code, the other tests the outcome.
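One way to ground window sizing in real data is to profile arrival latency in the source directly. This is a hedged sketch: it assumes the source exposes both an event timestamp (updated_at) and a warehouse load timestamp (loaded_at), and uses Snowflake-style datediff; adapt the column names and functions to your stack.

```sql
-- Distribution of arrival lag: how many days after the event's own
-- timestamp did each record actually land in the warehouse?
select
    datediff('day', updated_at, loaded_at) as arrival_lag_days,
    count(*) as records,
    round(100.0 * count(*) / sum(count(*)) over (), 2) as pct_of_total
from {{ ref('base__events') }}
where loaded_at >= current_date - 90
group by 1
order by 1
```

If a nontrivial share of records shows a lag beyond your lookback window (say, 10% arriving at 5 days against a 3-day window), the window is too narrow regardless of how clean your unit tests are.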
For the underlying patterns these tests are validating, see the lookback window pattern and idempotent incremental models.