
BigQuery Cost Model

How BigQuery pricing works across on-demand and editions models — bytes billed, slot hours, storage costs, and optimization levers

Status: Planted
Tags: bigquery, gcp, cost optimization

BigQuery’s cost model follows from its architecture: storage and compute are separated and billed independently. Compute dominates (85-90% of total spend for most teams); storage is typically 10-15%.

The Two Pricing Models

BigQuery offers two ways to pay for compute:

On-Demand: Pay Per Byte Scanned

On-demand pricing charges $6.25 per TiB of data scanned, with the first 1 TiB free monthly per billing account. You get access to a shared pool of roughly 2,000 slots (virtual compute units), and BigQuery handles allocation automatically. No capacity planning required.

The critical mechanic: you pay for bytes read from storage, not bytes returned to you. A query that scans a 5 TB table and returns 10 rows costs the same as one that returns 10 million rows. LIMIT 1000 does not reduce your bill — BigQuery scans the full dataset before applying the limit. This is a direct consequence of BigQuery’s columnar storage engine.
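A minimal illustration, assuming a hypothetical wide `project.dataset.events` table (a dry run in the console or API reports the same estimated bytes without executing anything):

-- Both statements scan the same rows; LIMIT applies only after the scan.
SELECT * FROM `project.dataset.events` LIMIT 1000;        -- bills every column
-- Narrowing the column list is what actually reduces bytes billed.
SELECT event_id FROM `project.dataset.events` LIMIT 1000; -- bills one column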

Editions: Pay Per Slot-Hour

Editions pricing (introduced July 2023, replacing legacy flat-rate) charges for compute time instead of data scanned. A slot is a virtual CPU with memory and network resources. You configure a baseline (guaranteed minimum, always billed) and a maximum (upper bound for autoscaling).

Three tiers exist:

| Edition | Cost/Slot-Hour | 1-Year Commit | 3-Year Commit | Best For |
|---|---|---|---|---|
| Standard | $0.04 | N/A | N/A | Dev/test, max 1,600 slots |
| Enterprise | $0.06 | $0.048 (20% off) | $0.036 (40% off) | Production, ML, BI Engine |
| Enterprise Plus | $0.10 | $0.08 (20% off) | $0.06 (40% off) | Regulated industries, DR |

Autoscaling operates in 50-slot increments with a 60-second minimum billing window. A query that uses 100 slots for 5 seconds is billed for 100 slot-minutes, not the 8.3 slot-minutes it actually consumed. This makes Editions relatively expensive for workloads consisting of many short queries. A practical rule of thumb: apply a 1.5x multiplier to theoretical slot-hour costs when estimating real-world autoscaling expenses.
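A runnable check of that arithmetic, using the Enterprise PAYG rate from the table above (the query shape is hypothetical):

-- 100 slots for 5 seconds is billed as 100 slots for the 60-second minimum.
SELECT
  ROUND(100 * 60 / 3600 * 0.06, 4) AS billed_usd,   -- 0.1    (100 slot-minutes)
  ROUND(100 * 5 / 3600 * 0.06, 4) AS consumed_usd;  -- 0.0083 (8.3 slot-minutes)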

Standard Edition lacks BigQuery ML, BI Engine, materialized view creation, and security features (CMEK, VPC-SC, row/column-level security). On-demand maintains feature parity with Enterprise Plus except for continuous queries and managed disaster recovery.

The Breakeven Between Models

The crossover depends on volume and workload shape. 100 slots running continuously on Standard Edition PAYG costs $2,920/month, equivalent to processing 467 TiB on-demand. If you process more than 400-500 TiB monthly with consistent patterns, Editions likely saves money.

But burst workloads shift the math dramatically. Processing 20 TiB in under an hour costs $125 on-demand versus roughly $5 for 100 Standard slots for one hour — 95% savings.
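The breakeven arithmetic behind those figures, as a runnable sketch (Standard PAYG rate, 730 hours per month):

-- 100 slots running 24/7 at $0.04/slot-hour, and the on-demand TiB that buys.
SELECT
  100 * 730 * 0.04 AS editions_usd_per_month,          -- 2920.0
  ROUND(100 * 730 * 0.04 / 6.25, 1) AS breakeven_tib;  -- 467.2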

The self-assessment query to run against your own data:

-- Compare 90 days of actual usage under both pricing models.
WITH cost_analysis AS (
  SELECT
    DATE_TRUNC(creation_time, MONTH) AS month,
    -- On-demand: bytes billed at $6.25/TiB.
    SUM(total_bytes_billed) / POW(1024, 4) * 6.25 AS on_demand_cost,
    -- Editions: slot-hours at Enterprise PAYG, with the 1.5x autoscaling multiplier.
    SUM(total_slot_ms) / 1000 / 3600 * 0.06 * 1.5 AS enterprise_cost
  FROM `region-us`.INFORMATION_SCHEMA.JOBS
  WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
    AND job_type = 'QUERY'
    AND statement_type != 'SCRIPT'
  GROUP BY month
)
SELECT
  month,
  ROUND(on_demand_cost, 2) AS on_demand_usd,
  ROUND(enterprise_cost, 2) AS enterprise_usd,
  ROUND(enterprise_cost / NULLIF(on_demand_cost, 0) * 100, 1) AS editions_pct
FROM cost_analysis
ORDER BY month DESC;

If editions_pct exceeds 75%, stay on-demand. Below 50%, Editions warrants serious evaluation. Between 50% and 75% is a gray zone where concurrency needs, cost predictability, and feature requirements should drive the decision.

Compute Costs: Where 85-90% of the Bill Lives

At $6.25/TiB on-demand, scanning 10 TiB daily costs $1,875/month in compute. Those same 10 TiB in storage cost about $200/month. A 50% reduction in bytes scanned saves $937 monthly; a 50% reduction in storage saves $100. The optimization hierarchy is clear.
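The arithmetic, as a one-line sanity check (compute at $6.25/TiB over a 30-day month; storage at the active logical rate):

SELECT
  10 * 6.25 * 30 AS monthly_compute_usd,   -- 1875.0
  10 * 1024 * 0.02 AS monthly_storage_usd; -- 204.8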

Three levers control the vast majority of compute costs:

Column selection. BigQuery’s columnar storage means each column is stored and read separately. Selecting 5 columns instead of 50 from a wide table can cut costs by 90%. SELECT * is one of the most expensive patterns in BigQuery. On a 5 TiB table with 10 equally sized columns, SELECT * costs $31.25 while SELECT col1, col2 costs $6.25.

Partition pruning. A partitioned table with proper filters typically reduces bytes scanned by 70-90%. The same query on the same table can cost 17x more depending on how you filter: WHERE event_date = '2024-01-15' scans only one partition, while WHERE DATE(event_timestamp) = '2024-01-15' applies a function to the partition column, defeating pruning entirely. Use require_partition_filter = true to prevent accidental full-table scans.
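A sketch of both filter shapes plus the guardrail, assuming a table partitioned on an event_date column (all names hypothetical):

-- Prunes: the filter targets the partition column directly.
SELECT user_id
FROM `project.dataset.events`
WHERE event_date = '2024-01-15';

-- Defeats pruning: the optimizer cannot map a function over a timestamp
-- column back to the event_date partitions, so every partition is read.
SELECT user_id
FROM `project.dataset.events`
WHERE DATE(event_timestamp) = '2024-01-15';

-- Guardrail: queries that omit a partition filter fail instead of full-scanning.
ALTER TABLE `project.dataset.events`
SET OPTIONS (require_partition_filter = TRUE);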

Clustering. Clustering sorts data within partitions by up to four columns, enabling block-level pruning. Unlike partitioning, clustering benefits don’t show up in dry-run estimates — a dry run might estimate 50 GB but the actual query scans 5 GB. Column order matters: place your most frequently filtered column first. The Shopify case study is instructive: adding clustering on filter columns reduced scans from 75 GB to 508 MB, a 150x improvement that prevented $949,000 in monthly costs.
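Clustering is declared when the table is created (or added later with DDL). A sketch with hypothetical names, most-filtered column first:

CREATE TABLE `project.dataset.events_clustered`
PARTITION BY event_date
CLUSTER BY customer_id, event_type  -- order matters: most-filtered column first
AS SELECT * FROM `project.dataset.events`;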

Storage Costs: Smaller but Worth Optimizing

Storage comes in two billing modes:

Logical billing (default): $0.02/GB/month for active data, $0.01/GB/month for long-term (90+ days without modification). Time travel and fail-safe storage are included free.

Physical billing: $0.04/GB/month for compressed data. Time travel and fail-safe billed separately. Physical billing wins when your compression ratio exceeds 2:1. Since typical data compresses 6-17x, most organizations save by switching. Query INFORMATION_SCHEMA.TABLE_STORAGE to check your actual compression ratios before deciding.
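A sketch of that compression check (US multi-region assumed; ratios above roughly 2:1 favor physical billing):

SELECT
  table_schema,
  table_name,
  ROUND(SUM(total_logical_bytes) / POW(1024, 3), 1) AS logical_gib,
  ROUND(SUM(total_physical_bytes) / POW(1024, 3), 1) AS physical_gib,
  ROUND(SUM(total_logical_bytes) / NULLIF(SUM(total_physical_bytes), 0), 1) AS compression_ratio
FROM `region-us`.INFORMATION_SCHEMA.TABLE_STORAGE
GROUP BY table_schema, table_name
ORDER BY logical_gib DESC;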

Streaming inserts cost $0.01 per 200 MB with a 1 KB minimum per row. Batch loading through BigQuery’s shared slot pool is completely free. For a table receiving 10 GB daily, streaming costs $15/month while batch loading costs nothing. Unless you need sub-minute latency, batch wins.
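Batch loading can even be done in plain SQL with LOAD DATA, which runs as a load job on the free shared pool (bucket path hypothetical):

LOAD DATA INTO `project.dataset.events`
FROM FILES (
  format = 'PARQUET',
  uris = ['gs://example-bucket/events/*.parquet']
);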

The Expensive Mistakes

CTE double-counting. BigQuery does not materialize non-recursive CTEs. Each reference re-executes the underlying scan. A CTE referenced twice in the same query costs twice as much. Materialize expensive CTEs to temporary tables when you need to reference them multiple times.
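A sketch of the temp-table workaround inside a script (table names hypothetical):

-- The orders table is scanned once, no matter how many references follow.
CREATE TEMP TABLE customer_totals AS
SELECT customer_id, SUM(amount) AS total
FROM `project.dataset.orders`
GROUP BY customer_id;

SELECT customer_id FROM customer_totals WHERE total > 1000
UNION ALL
SELECT customer_id FROM customer_totals WHERE total < 10;  -- no second scan of orders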

Self-joins. Google’s documentation warns these “can square the number of output rows.” Replace with window functions (LEAD, LAG, ROW_NUMBER) wherever possible — they accomplish most self-join use cases with a single table scan.
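For example, the common "time since the previous event per user" pattern, which would otherwise require a self-join (schema hypothetical):

SELECT
  user_id,
  event_time,
  TIMESTAMP_DIFF(
    event_time,
    LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time),
    SECOND
  ) AS seconds_since_prev  -- one scan, no join
FROM `project.dataset.events`;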

Missing max_bytes_billed. Without explicit limits, a single malformed query can scan petabytes. Set this in dbt’s profiles.yml as a safety net:

prod:
  type: bigquery
  maximum_bytes_billed: 107374182400  # 100 GiB

Queries exceeding this limit fail before scanning, catching misconfigured incremental models that accidentally run full refreshes.

Governance as Cost Control

For organizations with ad-hoc analytics users, governance controls may deliver more savings than technical optimizations. Google’s own example: 10 users running SELECT * ten times monthly on a 10 TB table generates over $5,000 in monthly compute on top of $200 in storage.

Project-level quotas: New BigQuery projects default to a 200 TiB daily quota as of September 2025. Set per-user and per-project limits to prevent individual analysts from monopolizing budget.

Authorized views: Instead of granting direct table access (which allows SELECT *), create views exposing only necessary columns. Users get the data they need without the ability to scan entire wide tables.
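A sketch with hypothetical dataset names (the view's dataset must then be authorized on the source dataset so users never need base-table access):

CREATE VIEW `project.reporting.orders_slim` AS
SELECT order_id, order_date, status  -- only the columns analysts actually need
FROM `project.raw.orders`;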

Separate service accounts per integration (dbt, Looker, Fivetran) enable cost attribution by system. Query INFORMATION_SCHEMA.JOBS_BY_PROJECT by user_email to find who drives spend. Often 3-5 queries or service accounts account for 70%+ of compute.
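A sketch of that attribution query (US region assumed; the on-demand rate gives a rough dollar figure even if you are on Editions):

SELECT
  user_email,
  COUNT(*) AS queries,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 6.25, 2) AS est_on_demand_usd
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
GROUP BY user_email
ORDER BY est_on_demand_usd DESC
LIMIT 10;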

The Optimization Hierarchy

  1. Query patterns (minutes to implement, 80%+ impact): Stop using SELECT *, always filter on partition columns, cluster tables over 64 MB
  2. Incremental models (hours, 70% savings potential): Switch large tables from full refresh to incremental, use insert_overwrite with copy_partitions: true on BigQuery
  3. Governance (prevents surprise bills): Set max_bytes_billed, implement daily quotas, create authorized views
  4. Pricing model (20-40% savings): Evaluate Editions if processing 400+ TiB monthly or running burst-heavy workloads
  5. Storage (10-30% savings): Evaluate physical vs. logical billing, set expiration policies on temporary tables, let long-term discounts work automatically

Start by querying INFORMATION_SCHEMA.JOBS_BY_PROJECT to find your top 5 most expensive queries. For each one, check: does it use SELECT *? Does it filter on the partition column? Is the table clustered on the columns in WHERE clauses? Fixing those three issues on your top queries likely captures most available savings.
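A starting point for that audit (US region assumed):

SELECT
  LEFT(query, 120) AS query_snippet,
  user_email,
  ROUND(total_bytes_billed / POW(1024, 4), 2) AS tib_billed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
ORDER BY total_bytes_billed DESC
LIMIT 5;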