SCD Type 2 with dbt Snapshots

SCD Type 2 (Slowly Changing Dimension Type 2) preserves every version of a record by storing one row per record-version, each with validity timestamps. This allows point-in-time queries and historical analysis of how records changed over time. dbt implements SCD Type 2 through its snapshot feature.

How dbt Snapshots Work

A dbt snapshot monitors a source table and, on each run, detects which records have changed. For changed records, it closes the previous version (sets an end date) and inserts the new version. The result is a history table with three extra columns:

dbt_valid_from — When this version became active
dbt_valid_to — When this version was superseded (NULL for the current version)
dbt_is_deleted — Whether the source record was deleted

To get the current state of any record, filter for dbt_valid_to IS NULL. To see what a record looked like on a specific date, filter for dbt_valid_from <= target_date AND (dbt_valid_to > target_date OR dbt_valid_to IS NULL).

As of dbt 1.9+, snapshots can be defined in YAML rather than SQL, following the naming convention snap__[entity]:

snapshots:
  - name: snap__salesforce__opportunity
    relation: source('salesforce', 'opportunity')
    config:
      strategy: timestamp
      updated_at: system_modstamp
      unique_key: id

Timestamp Strategy

The timestamp strategy uses a record’s updated_at field to detect changes. When the timestamp changes between snapshot runs, dbt knows the record was modified.

config:
  strategy: timestamp
  updated_at: system_modstamp
  unique_key: id

Advantages:

Fast — compares a single column, not the entire row
Simple to reason about
Works well when the source reliably updates the timestamp on every change

Limitations:

Misses changes that don’t update the timestamp. In Salesforce, [[CRM Data Extraction Challenges|formula field changes don’t update SystemModStamp]]. If a formula field recalculates, the timestamp stays the same, and your snapshot misses the change entirely.
Only captures the state at snapshot execution time. If a record changes three times between daily snapshot runs, you get one version, not three.

For Salesforce, the timestamp strategy with system_modstamp is the standard approach for most objects. It’s fast and reliable for direct field changes. Pair it with formula field recomputation in dbt if you need accurate formula field tracking.

Check Strategy

The check strategy compares actual column values between runs. Instead of looking at a timestamp, it checks whether specific columns have different values than the previous snapshot.

config:
  strategy: check
  check_cols:
    - stage_name
    - amount
    - close_date
    - owner_id
  unique_key: id

You can use check_cols: all to monitor every column, but this is slow on wide tables.

Advantages:

Catches changes that don’t update timestamps, including formula field changes
More accurate for recording every meaningful state change

Limitations:

Slower, especially on wide tables — compares multiple column values per row
Still only captures state at execution time
check_cols: all can be expensive and catches irrelevant changes (metadata timestamps, sync columns)

When to use check over timestamp: When the columns you’re tracking don’t reliably update a timestamp field. Salesforce formula fields are the classic case. If you only need to track stage progression and amount changes, list those specific columns in check_cols rather than using all.

Snapshot Frequency and Its Consequences

Snapshots capture state at execution time. If you run snapshots daily at 2 AM and a record changes at 9 AM, changes again at 1 PM, and changes again at 5 PM, your next snapshot at 2 AM the following day captures only the 5 PM state. The 9 AM and 1 PM versions are lost.

This matters most for CRM data where intraday changes are common:

Opportunities can change stage multiple times in a day during active sales cycles
Contact information updates during business hours
Deal amounts get revised as negotiations progress

For most reporting use cases, daily snapshots are sufficient. You need point-in-time accuracy at day boundaries, not minute-by-minute tracking. If you truly need finer granularity, consider Fivetran History Mode or running snapshots more frequently (but be aware of the performance cost).

Fivetran History Mode

Fivetran offers an alternative to dbt snapshots with History Mode. Instead of capturing state at snapshot execution time, History Mode records every version of a record as Fivetran observes it during syncs.

History Mode adds columns to the synced table:

_fivetran_start — When this version was first observed
_fivetran_end — When this version was superseded
_fivetran_active — Whether this is the current version

The dbt_salesforce package supports History Mode with dedicated daily history models for accounts, contacts, and opportunities:

vars:
  salesforce__account_history_enabled: true
  salesforce__opportunity_history_enabled: true
  global_history_start_date: '2024-01-01'

Fivetran History Mode vs dbt Snapshots

The key difference: Fivetran History Mode captures changes as they happen during syncs, while dbt snapshots only see state at snapshot execution time.

If you sync Salesforce every 15 minutes with History Mode enabled, you capture changes at 15-minute granularity. If a record changes twice between Fivetran syncs, you still miss the intermediate state — but the granularity is much finer than daily dbt snapshots.

Feature	dbt Snapshots	Fivetran History Mode
Granularity	At snapshot execution time	At sync time
Formula fields	Check strategy catches them	Same limitation as normal sync
Configuration	In dbt project	In Fivetran dashboard + dbt vars
Storage	Separate snapshot table	Additional columns on source table
Cost	Snapshot runs use warehouse compute	Included in Fivetran sync, increases row count

When to use which:

Use dbt snapshots when you want full control, when you’re tracking formula fields with check strategy, or when you don’t use Fivetran
Use Fivetran History Mode when you want finer-grained change tracking that matches your sync frequency, especially for high-velocity objects like opportunities

You can use both together. History Mode captures frequent changes at sync granularity, while dbt snapshots can track formula field changes that History Mode misses.

Querying Snapshot Data

Point-in-time queries follow a standard pattern. To reconstruct what opportunities looked like on a specific date:

SELECT
    id AS opportunity__id,
    stage_name AS opportunity__stage,
    amount AS opportunity__amount,
    dbt_valid_from,
    dbt_valid_to
FROM {{ ref('snap__salesforce__opportunity') }}
WHERE dbt_valid_from <= '2025-06-30'
  AND (dbt_valid_to > '2025-06-30' OR dbt_valid_to IS NULL)

For stage duration analysis — how long opportunities spend in each stage — use window functions to compare consecutive versions:

WITH versioned AS (
    SELECT
        id AS opportunity__id,
        stage_name AS opportunity__stage,
        dbt_valid_from AS version__started_at,
        COALESCE(dbt_valid_to, CURRENT_TIMESTAMP()) AS version__ended_at
    FROM {{ ref('snap__salesforce__opportunity') }}
)

SELECT
    opportunity__id,
    opportunity__stage,
    DATE_DIFF(version__ended_at, version__started_at, DAY) AS days_in_stage
FROM versioned
ORDER BY opportunity__id, version__started_at

This pattern answers questions like “how long do deals typically spend in negotiation before closing?”

Practical Considerations

Storage growth. Snapshot tables grow with every change. A Salesforce org with 50,000 opportunities that averages 5 stage changes per opportunity produces 250,000 snapshot rows. Over years, these tables get large. Consider partitioning snapshot tables by dbt_valid_from and applying retention policies for very old versions that no longer serve reporting needs.

Initial snapshot. The first snapshot run captures the current state of every record as the “initial” version. There’s no history before the first run. If you need historical data from before snapshots were enabled, Fivetran History Mode can backfill if it was already capturing changes.

Deleted records. When a source record is deleted (Fivetran marks _fivetran_deleted = TRUE), dbt snapshots close the last version and set dbt_is_deleted = TRUE. The historical versions remain intact, which is useful for audit trails.

Testing snapshots. Test snapshot tables with the same rigor as regular models. unique on (unique_key, dbt_valid_from) ensures no duplicate versions exist. not_null on dbt_valid_from catches initialization issues. Check that dbt_valid_to IS NULL has exactly one row per unique key (the current version).