The full_refresh: false config prevents dbt run --full-refresh from triggering a full rebuild on large incremental models, where a rebuild can mean billions of rows reprocessed, hours of compute time, and a multi-hundred-dollar BigQuery bill or a Snowflake credit spike. Intentional full refreshes remain possible through an explicit override.
How It Works
Add full_refresh=false to your model’s config block:
```sql
{{ config(
    materialized='incremental',
    unique_key='event_id',
    incremental_strategy='merge',
    full_refresh=false
) }}
```

With this set, running dbt run --full-refresh --select my_model behaves the same as a regular incremental run. The --full-refresh flag is silently ignored for this model.
To trigger a full refresh when needed, the model-level config must be temporarily removed or overridden — making the rebuild a deliberate, explicit action rather than a side effect of a broader command.
When to Use It
full_refresh: false is appropriate for:
Very large tables where full refresh takes hours. A fact table with 5 years of event data might take 6+ hours to fully rebuild. That’s not something you want happening because someone ran dbt run --full-refresh against the whole project without realizing this model was included.
Tables with significant compute cost per rebuild. If a full refresh costs $500 in BigQuery compute, adding full_refresh: false forces that to be a conscious decision rather than a side effect of a broader command.
Models downstream of external data sources. If your incremental model reads from an expensive API or a source that has limited historical availability, a full refresh might not even be possible — and if it runs, it may fail or produce incomplete data.
It’s less useful for small incremental models where a full refresh is fast and cheap. The complexity cost of managing the guard isn’t worth it if a rebuild takes 5 minutes.
The Tradeoff: Drift Accumulates Without Full Refreshes
full_refresh: false solves accidental rebuilds, but it doesn’t solve the underlying problem that incremental models drift from source truth over time.
Every lookback window has a limit. Records that arrive after the window closes get missed permanently. External systems have outages that delay data by weeks. CDC pipelines occasionally drop events. None of these are caught by a daily incremental run, no matter how well-configured the lookback.
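To make the limit concrete, here is a minimal sketch of the lookback pattern. The source, column names, and the three-day window are all illustrative assumptions, not from this doc — but the mechanic is general: any record whose event_time falls before the window when it finally arrives is never re-scanned.

```sql
-- Illustrative incremental model with a 3-day lookback window.
-- Source/column names and the window size are assumptions.
select
    event_id,
    event_time,
    payload
from {{ source('events', 'raw_events') }}
{% if is_incremental() %}
  -- Only re-scan the last 3 days relative to the latest loaded event.
  -- A record arriving later than that, with an older event_time, is
  -- missed until a full refresh. (Date function syntax varies by warehouse.)
  where event_time >= dateadd(day, -3, (select max(event_time) from {{ this }}))
{% endif %}
```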
The full refresh is the safety net that resets accumulated drift. If you prevent all full refreshes with full_refresh: false, drift compounds indefinitely. The model slowly diverges from what a correct rebuild would produce.
The right approach is a layered strategy:
- Daily incremental runs catch recent data, including late arrivals within the lookback window
- Periodic full refreshes reset drift — weekly or monthly depending on your tolerance and table size
- On-demand full refresh after known pipeline issues, source system outages, or significant schema changes
With full_refresh: false, the periodic and on-demand refreshes still happen — they just require explicit action rather than being possible to trigger accidentally.
How to Actually Run a Full Refresh When You Need One
Since full_refresh: false ignores the CLI flag, you have two options for forcing a rebuild:
Option 1: Temporarily edit the model config. Change full_refresh=false to full_refresh=true, run the full refresh, then revert. This is explicit and leaves a git history of the intentional rebuild.
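As a sketch, using the same config values as earlier in this doc, the temporary edit looks like this:

```sql
{{ config(
    materialized='incremental',
    unique_key='event_id',
    full_refresh=true  -- temporarily flipped from false; revert after the rebuild
) }}
```

Then run dbt run --full-refresh --select my_model and revert the config change once the rebuild succeeds.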
Option 2: Use a project-level variable override. Some teams set up a variable-driven approach:
```sql
{{ config(
    materialized='incremental',
    unique_key='event_id',
    full_refresh=false if var('protect_full_refresh', true) else true
) }}
```

Then run full refreshes with:
```
dbt run --full-refresh --select my_model --vars '{"protect_full_refresh": false}'
```

This keeps the guard in place by default but allows override without touching the model file. It's more ceremony than option 1, but it lets you rebuild without a code change, which is useful in CI/CD environments where code changes trigger review workflows.
Setting It at the Project Level
If you have many large models you want to protect, you can set the default at the project or folder level in dbt_project.yml rather than on each model:
```yaml
models:
  my_project:
    mart:
      +full_refresh: false
```

This applies full_refresh: false to all models in the mart folder. Individual models can override it if needed. This approach makes the protection a project-wide policy rather than a per-model decision, which is easier to maintain but requires team buy-in on the policy.
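As a sketch of that per-model override (the model name is hypothetical), a small model inside mart/ can opt back in by setting the config locally, since in-file config takes precedence over dbt_project.yml:

```sql
-- models/mart/small_summary.sql (hypothetical model)
{{ config(
    materialized='incremental',
    full_refresh=true  -- opts this model out of the folder-level guard
) }}
```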
Relationship to the Full Refresh Safety Net
No lookback window catches everything. A record with an event timestamp from 6 months ago that arrives today will be missed by any practical window size. This is expected — the incremental model is an optimization, not a guarantee of perfect consistency.
The periodic full refresh is what closes that gap. But if rebuilding takes 6 hours and costs hundreds of dollars, teams stop running it. The model drifts further, the gap between incremental and source truth grows, and eventually someone notices that the numbers don’t add up.
full_refresh: false solves the accidental rebuild problem. Scheduled full refreshes — perhaps via a separate dbt job that runs weekly on a Saturday — solve the drift problem. Together, they form the operational foundation for a large incremental model that stays trustworthy over time.
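As one hedged sketch of such a schedule, assuming the variable-override pattern from option 2 and a plain cron runner (the path and time are illustrative):

```
# Saturday 02:00: weekly full refresh of the protected model, passing the
# override variable so the guard is bypassed deliberately, not by accident.
0 2 * * 6 cd /opt/analytics/dbt_project && dbt run --full-refresh --select my_model --vars '{"protect_full_refresh": false}'
```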
For tables where even a weekly full refresh is impractical due to size, the audit_helper comparison pattern (described in testing late-arriving data) lets you quantify drift and make an informed decision about when a rebuild is worth the cost.
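As a minimal sketch of that comparison, assuming the audit_helper package is installed and a fresh rebuild exists under a separate name (my_model_full_rebuild is hypothetical), an analysis query using audit_helper's compare_relations macro might look like:

```sql
-- Compares the live incremental model against a fresh rebuild row by row;
-- the summary output quantifies how far the two have drifted.
{% set a = ref('my_model') %}
{% set b = ref('my_model_full_rebuild') %}

{{ audit_helper.compare_relations(
    a_relation=a,
    b_relation=b,
    primary_key='event_id'
) }}
```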