The Core Architecture: Separation of Storage and Compute
BigQuery’s defining innovation is complete separation between where your data lives and where queries run. Your tables live in Colossus, Google’s distributed file system, stored in a columnar format called Capacitor. Your queries execute separately via Dremel, a distributed query engine that parallelizes work across thousands of virtual processors. The Jupiter network (6+ petabits/second bandwidth) connects them. This matters because it means storage and compute costs are independent. A massive table queried once costs far less than a small table queried constantly.
Unlike data warehouses where you provision and manage compute yourself (Redshift clusters, Snowflake virtual warehouses), BigQuery automatically allocates virtual compute units called slots based on query complexity. On-demand customers get access to up to 2,000 shared slots. For predictable workloads, you reserve dedicated slots.
Why Columnar Storage Changes Everything
Capacitor stores each column separately on disk. When you run SELECT user_id, event_date FROM events, BigQuery reads only those two columns. The other 47 columns? Never touched. Never billed.
BigQuery bills for data scanned, not data returned.
This makes SELECT * catastrophically expensive. A table with 100 columns costs roughly 100x more to fully scan than selecting a single column. Adding LIMIT 1000 doesn't reduce costs: BigQuery still scans the selected columns in full before applying the limit. This columnar model is why partitioning and clustering work so well: they let BigQuery skip reading columns and blocks entirely.
The cost implications are immediate and practical. The difference between a query that explicitly selects 5 columns versus scanning all 87 columns from a base model is a 94% cost reduction. For teams running daily dbt jobs, this compounds into thousands in monthly savings.
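The 94% figure falls straight out of the billing model. A minimal sketch, assuming columns of roughly equal size (the same assumption behind the "roughly 100x" claim above); the 2 TiB table size is illustrative:

```python
# Back-of-envelope scan-cost estimator for BigQuery on-demand pricing.
# Assumes columns contribute equal bytes; real savings depend on each
# column's actual size.

ON_DEMAND_PER_TIB = 6.25  # USD per TiB scanned (on-demand rate)
TIB = 1024 ** 4

def scan_cost(table_bytes: float, cols_selected: int, cols_total: int) -> float:
    """Cost of scanning only the selected columns of a table."""
    scanned = table_bytes * cols_selected / cols_total
    return scanned / TIB * ON_DEMAND_PER_TIB

table_bytes = 2 * TIB                    # hypothetical 2 TiB base model
full = scan_cost(table_bytes, 87, 87)    # SELECT * over all 87 columns
narrow = scan_cost(table_bytes, 5, 87)   # explicit 5-column select

print(f"full scan: ${full:.2f}")              # $12.50
print(f"5 columns: ${narrow:.2f}")            # $0.72
print(f"reduction: {1 - narrow / full:.0%}")  # 94%
```

Multiply that per-run difference by every scheduled dbt invocation and the monthly savings in the text follow directly.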
Compute: Slots and Fair Scheduling
A BigQuery slot is a virtual compute unit combining CPU, memory, and network capacity. When you submit a query, BigQuery decomposes it into stages, breaks those stages into parallel work units, and assigns slots to execute them. Complex queries use more slots. High concurrency (many queries running simultaneously) requires more slots. When demand exceeds slot capacity, queries queue and slow down.
On-demand pricing gives you access to a shared pool of roughly 2,000 slots globally. Capacity-based pricing (BigQuery Editions) lets you reserve dedicated slots. The key difference: on-demand has variable performance during peak hours; reserved slots guarantee capacity.
Fair scheduling is how BigQuery distributes slots when multiple queries compete. It divides slots equally among projects first, then among jobs within each project. This matters architecturally: if you run production dbt and ad-hoc analyst queries in the same project, they compete for the same slot pool. Separate them into different projects assigned to different reservations, and they’re isolated.
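The two-level split can be sketched in a few lines. The function name and the project/job counts here are hypothetical, and BigQuery's real scheduler rebalances continuously and preemptively as jobs start and finish:

```python
# Two-level fair split: slots divide equally among projects,
# then each project's share divides equally among its running jobs.

def fair_shares(total_slots: int, jobs_per_project: dict[str, int]) -> dict[str, float]:
    per_project = total_slots / len(jobs_per_project)
    return {proj: per_project / jobs for proj, jobs in jobs_per_project.items()}

# dbt and ad-hoc work sharing ONE project: 10 jobs split 2,000 slots.
mixed = fair_shares(2000, {"analytics": 10})

# Split into two projects: each gets 1,000 slots regardless of how
# noisy the other project is.
split = fair_shares(2000, {"dbt-prod": 2, "adhoc": 8})

print(mixed)  # every job gets 200 slots
print(split)  # dbt jobs get 500 each; ad-hoc jobs get 125 each
```

In the mixed project, two production dbt jobs are held to 200 slots each because eight analyst queries dilute the pool; after the split, the same dbt jobs get 500 each.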
The Cost Model
Your BigQuery bill comes from three buckets:
- Compute (85-90% of spend): Bytes scanned per query, billed at $6.25/TiB
- Storage (10-15% of spend): Data stored, billed at $0.02/GB/month for active storage
- Streaming inserts (usually negligible): $0.01 per 200 MB
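Plugging the three rates into a workload shows why compute dominates. The workload numbers below (TiB scanned, GB stored, MB streamed) are illustrative assumptions, not measurements:

```python
# Rough monthly-bill breakdown using the published rates above.

def monthly_bill(tib_scanned: float, gb_stored: float, mb_streamed: float) -> dict[str, float]:
    return {
        "compute": tib_scanned * 6.25,          # $6.25 per TiB scanned
        "storage": gb_stored * 0.02,            # $0.02 per GB-month, active
        "streaming": mb_streamed / 200 * 0.01,  # $0.01 per 200 MB
    }

bill = monthly_bill(tib_scanned=100, gb_stored=5000, mb_streamed=50_000)
total = sum(bill.values())
print(bill)  # compute $625, storage $100, streaming $2.50
print(f"compute share: {bill['compute'] / total:.0%}")  # ~86%
```

Even a storage footprint in the terabytes barely registers next to routine scanning, which is why the optimization hierarchy below starts with scan size.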
Because bytes scanned dominates the bill, the optimization hierarchy is clear. Reducing scan size by 50% saves far more than reducing storage by 50%. Partitioning and clustering directly reduce scan size by helping BigQuery skip irrelevant data blocks. Selecting specific columns instead of SELECT * reduces scan size. Incremental models instead of full refreshes reduce scan size.
This also means your table design choices have immediate cost consequences. Wide denormalized tables with many columns aren’t inherently bad—but scanning them carelessly becomes expensive fast. A single analyst running unfiltered SELECT * queries on a 10TB table can generate a $5,000 monthly bill from ad-hoc work alone.
Storage is Separated from Compute for a Reason
Because storage and compute are independent, you can optimize them separately. A 100GB table stored for a year costs about $24 in storage ($0.02/GB/month). If that table is queried, the compute cost depends entirely on how many columns you select and whether the table is partitioned and clustered. Select every column with no partition filter and you scan the full table; select a few columns from pruned partitions, and you might scan just 100MB.
The same table can cost about $0.60 in compute for a full unoptimized scan, or a fraction of a cent for the optimized 100MB read. Repeat the unoptimized query hourly and the gap compounds into hundreds of dollars a month. The storage cost never changes.
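Each meter can be computed on its own from the published rates; the query sizes here are illustrative:

```python
# Storage and compute are metered independently.

GB = 1024 ** 3
TIB = 1024 ** 4

def storage_per_year(gb: float, rate: float = 0.02) -> float:
    """Active-storage cost: $0.02 per GB per month."""
    return gb * rate * 12

def compute_per_query(bytes_scanned: float, rate: float = 6.25) -> float:
    """On-demand compute cost: $6.25 per TiB scanned."""
    return bytes_scanned / TIB * rate

print(f"storage, 100 GB for a year: ${storage_per_year(100):.2f}")        # $24.00
print(f"compute, full 100 GB scan:  ${compute_per_query(100 * GB):.2f}")  # $0.61
print(f"compute, 100 MB scan:       ${compute_per_query(100 * 1024**2):.4f}")  # $0.0006
```

Notice that nothing in `storage_per_year` depends on query behavior and nothing in `compute_per_query` depends on table size at rest; that independence is the architectural point.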
Long-term storage adds another layer. Data transitions to long-term pricing automatically after 90 consecutive days without modification, dropping to $0.01/GB/month: a 50% discount requiring zero effort. For archival tables, this is free money.
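The discount is easy to quantify; the table size here is an assumption:

```python
# Active vs long-term storage rates for an untouched table.
ACTIVE_RATE = 0.02     # $/GB/month
LONG_TERM_RATE = 0.01  # $/GB/month, automatic after 90 days unmodified

size_gb = 1024  # hypothetical 1 TB archival table
print(f"active:    ${size_gb * ACTIVE_RATE:.2f}/month")     # $20.48
print(f"long-term: ${size_gb * LONG_TERM_RATE:.2f}/month")  # $10.24
```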
Datasets and Regions: Permanent Decisions
Datasets are BigQuery’s logical containers. They don’t appear in the standard GCP hierarchy but they’re the primary unit for access control. Every table inherits permissions from its parent dataset. More importantly, location is set at dataset creation and cannot be changed afterward. A dataset created in US stays in US forever. Moving it requires exporting data, creating a new dataset in the correct region, reloading everything, and updating all references.
This immutability creates the “region mixing catastrophe”: a single misplaced dataset blocks cross-region joins and breaks every pipeline that depends on them. BigQuery doesn’t allow joins across regions. The rule is absolute: all datasets referenced in a single query must be in the same location. Choose your region once, document it prominently, enforce it in code review.
The Project Boundary and Quota Isolation
Projects form the fundamental organizational unit. Each project carries its own billing attribution, its own quotas, and its own slot allocations. On-demand pricing gives each project access to up to 2,000 concurrent slots, with 20,000 shared across an organization.
This boundary is architectural, not cosmetic. It enables powerful patterns: a central data team manages raw data in one project; department teams query that data from their own projects, with compute costs flowing to their budgets. It prevents a runaway development query from starving production dashboards of resources (they’re in different projects with different quotas).
For the same reason, separating dev and prod into different projects is best practice for teams with multiple people. A shared project means ad-hoc development work competes with production pipelines for the same slot pool.
Related Notes
See Partitioning vs Clustering, BigQuery Slots and Reservations, and BigQuery Cost Model for deeper coverage of specific architectural components.