The conventional dbt materialization recommendation is views for base/staging, ephemeral for intermediate, and tables for marts. The argument here is that defaulting to tables at every layer produces more debuggable, stable, and maintainable projects. The only exceptions are incremental (when volume demands it) and view (when data must be fresh within minutes).
The conventional distribution across layers:
| Layer | Conventional | Rationale |
|---|---|---|
| Base/Staging | view | Fresh data, save storage |
| Intermediate | ephemeral | Just CTEs, not real tables |
| Marts | table | Performance for end users |
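Expressed as dbt_project.yml configuration, the conventional layout above would look roughly like this (the project name and folder names are placeholders mirroring this article's layers):

```yaml
# Conventional per-layer defaults: views for base, ephemeral for
# intermediate, tables for marts
models:
  my_project:
    base:
      +materialized: view
    intermediate:
      +materialized: ephemeral
    marts:
      +materialized: table
```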
The Case for Tables Everywhere
Debugging requires intermediate results
When something breaks in production — and it will — you need to be able to query intermediate results. With ephemeral models, those results simply don’t exist in your warehouse. You cannot run SELECT * FROM int__session LIMIT 100 to check your sessionization logic. You cannot count rows in base__ga4__event to verify the unnesting worked. You are flying blind through a stack of Jinja-compiled CTEs, trying to reconstruct what happened.
Tables give you a checkpoint at every layer. When mrt__marketing__campaign_performance looks wrong, you can query int__session, then int__session__session_lj_conversion, then base__ga4__event. You trace the problem back to its source in minutes instead of hours.
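With tables at every layer, that trace is just a handful of ad-hoc queries. A sketch, using the model names from the example above (the specific checks are illustrative):

```sql
-- Work backwards from the broken mart, one layer at a time
select count(*) from int__session;                            -- did sessionization produce rows?
select * from int__session__session_lj_conversion limit 100;  -- did the conversion join behave?
select count(*) from base__ga4__event;                        -- did the unnesting work?
```

None of these queries are possible when the intermediate layer is ephemeral.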
Ephemeral models optimize for compile-time simplicity at the cost of runtime debuggability. That’s a bad trade in production.
Views cascade schema breaks instantly
Views are re-evaluated on every query. If an upstream source adds, removes, or renames a column, every downstream view breaks immediately — at query time, in front of your dashboard users.
Tables act as a buffer. Your source changes its schema. Your pipeline runs, your base model fails to build (a controlled failure you can investigate), and your tables still contain yesterday’s data. The dashboard still works. You have time to fix the issue before it cascades. That’s the difference between a pipeline incident and a data incident.
Storage is cheap relative to compute
The common justification for views is cost: avoiding storage you don’t need. On modern cloud warehouses, storage costs are negligible compared to compute.
BigQuery charges $0.02 per GB per month for active storage. A table with 10 million rows might cost $0.50 per month to store. An intermediate model with a year of session data — 500 million rows, let’s say — might cost $3 per month.
Views recompute on every query. Every dashboard reload, analyst query, and downstream model build scans the entire upstream chain. A single BigQuery query scanning 500 GB costs $2.50. Run that ten times a day in a dashboard and the compute cost is $750 per month — against a $3 monthly storage cost for the equivalent table.
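The arithmetic above can be sketched as a quick back-of-envelope calculation (the rates and sizes are the illustrative figures from this section, not a pricing model):

```python
# Back-of-envelope storage-vs-compute comparison for a materialized
# table versus a view. Rates are the BigQuery on-demand figures used
# in the text: $0.02/GB/month active storage, $5/TB ($0.005/GB) scanned.
STORAGE_PER_GB_MONTH = 0.02
COMPUTE_PER_GB_SCANNED = 0.005

def monthly_storage_cost(size_gb: float) -> float:
    """Cost of keeping the materialized table around for a month."""
    return size_gb * STORAGE_PER_GB_MONTH

def monthly_view_compute(scan_gb: float, queries_per_day: int, days: int = 30) -> float:
    """Compute cost of a view that rescans upstream data on every query."""
    return scan_gb * COMPUTE_PER_GB_SCANNED * queries_per_day * days

print(monthly_storage_cost(150))      # ~150 GB of session data -> 3.0 ($3/month)
print(monthly_view_compute(500, 10))  # 500 GB scanned, 10x/day -> 750.0 ($750/month)
```

A 250x cost difference, in favor of the materialization the view was supposedly saving money on.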
One strategy is easier to teach and maintain
Consistency reduces cognitive overhead. With tables as the default, there is one rule: everything is a table, incremental if large, view if genuinely real-time. The exception cases are obvious because they’re exceptions. New team members onboard to a simpler mental model, and dbt_project.yml conventions are straightforward.
When to Override the Default
The “tables everywhere” default has two legitimate exceptions.
Incremental when volume demands it
Switch to incremental materialization when:
- Tables reach tens of millions of rows
- Daily appends are the dominant pattern
- Full refresh takes more than 5 minutes
For marketing analytics specifically: GA4 events tables are the primary candidate (incremental with 3-day lookback for late-arriving events). Ad platform daily extracts are another (incremental by date, with a 30-day lookback for attribution window updates).
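A minimal sketch of what the GA4 model might look like as an incremental on BigQuery (the partition column name `event_date` is an assumption about the source schema; the 3-day lookback mirrors the guidance above):

```sql
-- models/base/ga4/base__ga4__event.sql (sketch)
{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'field': 'event_date', 'data_type': 'date'}
  )
}}

select *
from {{ source('ga4', 'events') }}

{% if is_incremental() %}
  -- 3-day lookback: reprocess recent partitions so late-arriving
  -- events are picked up on subsequent runs
  where event_date >= date_sub(current_date(), interval 3 day)
{% endif %}
```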
The threshold matters. Incremental models are meaningfully more complex — they require careful thought about your incremental strategy, late-arriving data handling, and edge cases at the boundary of full refresh vs. incremental runs. Don’t add that complexity until table materialization is genuinely a bottleneck. For most tables under 10 million rows, the full refresh cost is trivial.
Views when data genuinely needs to be fresh within minutes
Some use cases require live data: fraud detection feeding a real-time decision engine, inventory levels for an ecommerce platform, campaign performance during a high-stakes launch. For these, the query-time compute cost is an acceptable trade for freshness.
Most analytics use cases tolerate hourly or daily latency. Before adding a view for “freshness,” verify whether anyone has actually reported data being stale and at what granularity they need it current.
Configuring This in dbt_project.yml
Set the default at the project level and override only where necessary:
```yaml
models:
  my_project:
    +materialized: table             # Default everything to table
    base:
      +schema: base
      ga4:
        +materialized: incremental   # High volume
        +incremental_strategy: insert_overwrite
    intermediate:
      +schema: intermediate
    marts:
      +schema: marts
```

This configuration is declarative and reviewable. Anyone reading it can see that GA4 base models are incremental because of volume, and everything else is a table. The exception is explicit; the default is implicit.
On Ephemeral Models
Ephemeral models are inlined as CTEs into every downstream model that references them. An ephemeral model referenced by three marts results in that CTE being inlined into three separate SQL queries — the same transformation runs three times. A table runs once, stores the result, and is referenced by all three marts. No redundant computation, queryable, consistent. The only legitimate reason to use ephemeral is to avoid database object count limits, which is a warehouse constraint issue, not a modeling choice.
On View Freshness
Views re-evaluate on every query, which means every source schema change breaks them at query time, dashboard queries are slow when upstream tables are large, and intermediate logic recomputes from scratch on every query. Tables give control over when data refreshes — at pipeline run time, not at dashboard open time. That predictability is more valuable than marginal freshness improvement in most analytics use cases.
Summary
Views and ephemeral models optimize for avoiding recomputation. Tables optimize for reliability: queryable, debuggable, stable outputs. The incremental exception is a performance optimization applied when table refreshes become genuinely expensive. The view exception is a freshness optimization applied only when real-time data is required. Everything else: table.