Materialized views precompute aggregation results and refresh incrementally. Subsequent queries read from the precomputed result rather than rescanning the base table.
BigQuery materialized views support transparent query rewriting: the optimizer automatically rewrites queries against the base table to use the MV when it detects a match. No changes to consumer queries are required.
Creating a Materialized View
The core pattern: define the aggregation you want to precompute, configure refresh behavior, and set staleness tolerance.
```sql
CREATE MATERIALIZED VIEW `project.dataset.mv_daily_revenue`
OPTIONS (
  enable_refresh = true,
  refresh_interval_minutes = 30,
  max_staleness = INTERVAL '4' HOUR
)
AS
SELECT
  DATE(transaction_time) AS transaction_date,
  product_category,
  SUM(amount) AS total_revenue,
  COUNT(*) AS transaction_count
FROM `project.dataset.transactions`
GROUP BY 1, 2;
```

Three configuration options control the behavior:
enable_refresh: Turns on automatic incremental refresh. BigQuery tracks which rows in the base table have changed since the last refresh and only recomputes those. This is fundamentally different from a scheduled query that recomputes everything.
refresh_interval_minutes: How often BigQuery checks for changes. Setting this to 30 means the MV refreshes at most every 30 minutes. Shorter intervals give fresher data but consume more compute.
max_staleness: This is the key tuning lever. It tells BigQuery how old the MV data can be before it must refresh. Setting INTERVAL '4' HOUR says “I’m okay with data that’s up to 4 hours old.” This lets the optimizer use the MV more aggressively and reduces refresh frequency. For real-time dashboards, omit it (the default is always-fresh). For weekly reports, set it to days.
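Staleness tolerance can also be tuned after the fact without rebuilding the view, using `ALTER MATERIALIZED VIEW ... SET OPTIONS`. A sketch against the view defined above (the 2-hour value is illustrative):

```sql
-- Tighten or loosen staleness tolerance on an existing MV in place;
-- no rebuild or backfill is required.
ALTER MATERIALIZED VIEW `project.dataset.mv_daily_revenue`
SET OPTIONS (max_staleness = INTERVAL '2' HOUR);
```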
Transparent Query Rewriting
This is where MVs differ from simply creating a summary table. When you query the base table:
```sql
SELECT
  DATE(transaction_time) AS transaction_date,
  SUM(amount) AS total_revenue
FROM `project.dataset.transactions`
WHERE DATE(transaction_time) >= '2025-01-01'
GROUP BY 1;
```

BigQuery recognizes that this query can be answered from mv_daily_revenue and rewrites it automatically. The user doesn’t know or care that an MV exists. The query planner handles the routing.
This works because BigQuery can verify that the MV result is equivalent to (or derivable from) the original query. It handles aggregation rollups: if the MV groups by product_category and your query doesn’t, BigQuery can aggregate across categories from the MV result instead of the base table.
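As a sketch of such a rollup: the MV defined earlier groups by `(transaction_date, product_category)`, so a query that drops the category can be answered by re-aggregating the MV’s partial results rather than rescanning the base table.

```sql
-- No product_category in GROUP BY: the optimizer can serve this by
-- summing mv_daily_revenue's per-category partial aggregates per day.
-- COUNT(*) rolls up as the sum of the MV's transaction_count column.
SELECT
  DATE(transaction_time) AS transaction_date,
  SUM(amount) AS total_revenue,
  COUNT(*) AS transaction_count
FROM `project.dataset.transactions`
GROUP BY 1;
```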
What MVs Can and Cannot Do
Supported: Simple aggregations (SUM, COUNT, AVG, MIN, MAX, COUNT DISTINCT, HLL_COUNT.INIT, APPROX_COUNT_DISTINCT), GROUP BY, WHERE filters, INNER JOIN to dimension tables, and UNION ALL of tables.
Not supported: ORDER BY, LIMIT, HAVING, subqueries in the SELECT clause, non-deterministic functions like CURRENT_TIMESTAMP(), window functions, outer joins, and cross joins.
The restriction on non-deterministic functions is the one that catches people most often. If your aggregation includes CURRENT_TIMESTAMP() for a “last refreshed” column, BigQuery can’t do incremental refresh because the value changes on every evaluation. The MV falls back to full recomputation, defeating the purpose entirely.
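If the goal is a freshness indicator, one workaround is to derive it from the data itself rather than from the clock. A hedged sketch (the `_v2` view name is illustrative; columns follow the earlier example):

```sql
-- CURRENT_TIMESTAMP() would block incremental refresh. MAX over a
-- base-table column is deterministic and refreshes incrementally.
CREATE MATERIALIZED VIEW `project.dataset.mv_daily_revenue_v2`
OPTIONS (enable_refresh = true)
AS
SELECT
  DATE(transaction_time) AS transaction_date,
  SUM(amount) AS total_revenue,
  MAX(transaction_time) AS latest_transaction_seen  -- deterministic freshness proxy
FROM `project.dataset.transactions`
GROUP BY 1;
```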
Keep MVs simple: aggregations over filtered base tables. Complex transformations belong in your dbt models.
Staleness as a Design Decision
The default behavior (no max_staleness) means the MV must be perfectly fresh before BigQuery will use it. In practice, this means frequent refreshes and higher compute costs for keeping the MV updated.
Most dashboard consumers can tolerate some staleness. Ask the question explicitly: does this report need data from the last 5 minutes, or is data from the last 2 hours fine? The answer determines your max_staleness setting, which in turn determines your refresh cost.
A reasonable starting point for different use cases:
| Use Case | max_staleness | Rationale |
|---|---|---|
| Executive dashboards | 4-8 hours | Updated a few times daily is sufficient |
| Operational monitoring | 30-60 minutes | Near-real-time without the cost of always-fresh |
| Weekly reports | 1-2 days | Only needs to be current on report day |
| Real-time alerting | Omit (always fresh) | Staleness defeats the purpose |
Performance Impact
Google reports queries scanning 500 GB reduced to 2.5 MB with properly designed MVs — a 200,000x reduction. Typical improvements:
- Query speedup: 5-100x for complex aggregations
- Bytes scanned: reduced by the ratio of raw data to aggregated data
- Slot consumption: proportional reduction since less data means less compute
The refresh cost is the tradeoff. Each refresh scans the changed data in the base table and recomputes affected aggregations. For append-heavy tables (like event logs), incremental refreshes are cheap because only new rows are processed. For tables with widespread updates, refreshes approach full recomputation.
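Refreshes can also be triggered on demand, for example immediately after a large batch load, via the `BQ.REFRESH_MATERIALIZED_VIEW` system procedure:

```sql
-- Force an immediate refresh instead of waiting for the next
-- scheduled interval.
CALL BQ.REFRESH_MATERIALIZED_VIEW('project.dataset.mv_daily_revenue');
```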
Limits and Constraints
BigQuery imposes hard limits:
- 20 MVs per dataset
- 100 MVs per project
- MV and base table must be in the same dataset
- Base table must be a native BigQuery table (not external or federated)
The 20-per-dataset limit is the one most teams hit first. It forces you to be intentional about which aggregations deserve materialization. Prioritize MVs for queries that run frequently against large tables with stable aggregation patterns.
MVs vs dbt Incremental Models
There’s overlap between materialized views and dbt incremental models. Both precompute results to avoid repeated full scans. The distinction:
- MVs are managed by BigQuery, refresh automatically, and support transparent query rewriting. They’re best for dashboard-facing aggregations where consumers query the base table directly.
- dbt incremental models are managed by your pipeline, offer full SQL flexibility (joins, window functions, complex logic), and produce regular tables. They’re best for transformations in your modeling layer.
MVs work well as a final optimization layer: after dbt models produce clean, well-structured tables, an MV on top accelerates dashboard queries without modifying Looker or Tableau code.
Creating MVs with dbt
dbt supports materialized views as a materialization strategy (introduced in dbt 1.7 for BigQuery):
```sql
{{ config(
    materialized='materialized_view',
    on_configuration_change='apply'
) }}

SELECT
  DATE(transaction_time) AS transaction_date,
  product_category,
  SUM(amount) AS total_revenue,
  COUNT(*) AS transaction_count
FROM {{ ref('fct_transactions') }}
GROUP BY 1, 2
```

The on_configuration_change='apply' setting tells dbt to alter the existing MV in place when its configuration (such as the refresh interval) changes; the alternatives are 'continue' (skip and warn) and 'fail'. A change to the SELECT itself requires dbt to drop and recreate the view.
The same restrictions apply: no complex SQL features, no non-deterministic functions, no window functions. If your dbt model needs any of these, stick with incremental or table materialization.
When to Use Materialized Views
MVs are appropriate when:
- The same aggregation runs repeatedly (dashboards, reports, API endpoints)
- The base table is large enough that scan costs matter
- The aggregation is simple enough to fit within MV constraints
- Consumers can tolerate some data staleness
- The 20-per-dataset and 100-per-project limits are not exceeded
When aggregations are complex, require joins to multiple tables, or use window functions, use a dbt model instead. MVs are optimized for the simple, high-frequency case. The ROI calculation is straightforward: compare the ongoing refresh cost against the cumulative scan cost of running the aggregation directly; the MV pays off once the latter exceeds the former.