A BigQuery slot is a virtual compute unit bundling CPU, memory, and network resources. When a query runs, BigQuery assembles slots to process it in parallel. More available slots means more parallel work. When demand exceeds supply, queries slow down.
Slots are the foundation of BigQuery’s capacity model. BigQuery Editions, reservations, baseline and autoscaling, and fair scheduling are all mechanisms for managing slot allocation.
How Queries Use Slots
When you submit a SQL query, BigQuery doesn’t just run it line by line. Instead, it:
- Parses your SQL and creates an execution plan
- Decomposes the plan into stages (logical units of work)
- Breaks each stage into steps that can run in parallel
- Assigns slots to execute those steps
- Shuffles intermediate results between stages
- Reassigns slots dynamically as the query progresses
This is why BigQuery can process terabytes in seconds. It’s not one machine working hard; it’s thousands of slots working together.
┌─────────────────────────────────────────────────────────────┐
│                       Your SQL Query                        │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                       Execution Plan                        │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐   │
│  │ Stage 1 │ -> │ Stage 2 │ -> │ Stage 3 │ -> │ Stage 4 │   │
│  │ (Scan)  │    │ (Filter)│    │  (Agg)  │    │ (Sort)  │   │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│              Slots Execute Stages in Parallel               │
│       [Slot 1]   [Slot 2]   [Slot 3]  ...  [Slot N]         │
└─────────────────────────────────────────────────────────────┘

Each stage in the execution plan represents a logical unit of work — a table scan, a filter, an aggregation, a sort. BigQuery breaks each stage into many parallel work units and distributes them across available slots. As slots complete their assigned work, they pick up the next work unit. This dynamic reassignment is key: BigQuery doesn’t statically partition the work at the start. It continuously rebalances as the query progresses.
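The dynamic reassignment described above can be sketched as a shared work queue: each slot pulls the next work unit as soon as it finishes its current one. This is an illustrative model, not BigQuery's actual scheduler, and the unit and slot counts are invented:

```python
from collections import deque

def run_stage(work_units, num_slots):
    """Simulate slots pulling work units from a shared queue.

    Each slot grabs the next unit as soon as it finishes its current
    one, so no slot sits idle while work remains queued. Returns the
    simulated wall-clock time, assuming each unit costs 1 time step.
    """
    queue = deque(work_units)
    slot_busy_until = [0] * num_slots          # when each slot frees up
    while queue:
        # The slot that frees up earliest grabs the next unit.
        s = min(range(num_slots), key=lambda i: slot_busy_until[i])
        queue.popleft()
        slot_busy_until[s] += 1                # one time step per unit
    return max(slot_busy_until)

# 2,000 work units on 1,000 slots: two "waves" of work.
print(run_stage(range(2000), 1000))   # -> 2
# The same stage with 2,000 slots finishes in one wave.
print(run_stage(range(2000), 2000))   # -> 1
```

Because slots pull work rather than being assigned it up front, a slow work unit never strands a batch of fast ones behind it.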
The shuffle step between stages is where intermediate results move between slots. A scan stage might read data across hundreds of slots, then shuffle results to a smaller set of slots for aggregation. This is the same MapReduce-style pattern that the Dremel engine has used since its inception — just abstracted behind the slot concept.
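That scan → shuffle → aggregate pattern can be sketched in miniature, with hash partitioning standing in for BigQuery's shuffle layer (the slot counts and rows here are illustrative, not how Dremel is implemented):

```python
from collections import defaultdict

def scan(rows, num_scan_slots):
    """Wide stage: split the table scan across many 'slots'."""
    return [rows[i::num_scan_slots] for i in range(num_scan_slots)]

def shuffle(chunks, num_agg_slots):
    """Shuffle: route each row to an aggregation slot by key hash,
    so all rows for a given key land on the same slot."""
    partitions = [defaultdict(int) for _ in range(num_agg_slots)]
    for chunk in chunks:
        for key, value in chunk:
            partitions[hash(key) % num_agg_slots][key] += value
    return partitions

def aggregate(partitions):
    """Narrow stage: each agg slot sums its partition; merge results."""
    result = {}
    for part in partitions:
        result.update(part)
    return result

rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
chunks = scan(rows, num_scan_slots=4)       # wide scan stage
parts = shuffle(chunks, num_agg_slots=2)    # narrow aggregation stage
print(sorted(aggregate(parts).items()))     # [('a', 4), ('b', 7), ('c', 4)]
```

The key property is the fan-in: many scan slots feed a smaller set of aggregation slots, and the shuffle guarantees every row for a key ends up in one place.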
Slot Contention
When a query stage needs 2,000 slots but only 1,000 are available, BigQuery uses all 1,000 and queues the remaining work units. Slots pick up queued work as they complete. The query finishes, but more slowly.
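A back-of-the-envelope model of that slowdown: with work units queued behind limited slots, stage time scales roughly with the number of "waves" needed to drain the queue. The per-unit time is invented for illustration; real stage times depend on unit sizes and shuffle costs.

```python
import math

def stage_time(work_units, available_slots, secs_per_unit=1.0):
    """Rough model: queued units wait for a free slot, so the stage
    runs in 'waves' of at most `available_slots` units each."""
    waves = math.ceil(work_units / available_slots)
    return waves * secs_per_unit

# The stage has 2,000 slots' worth of parallel work:
print(stage_time(2000, 2000))  # 1.0 -> every unit runs at once
print(stage_time(2000, 1000))  # 2.0 -> half the work waits its turn
print(stage_time(2000, 500))   # 4.0 -> contention compounds
```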
This is slot contention. BigQuery degrades gracefully — no hard failure, no timeout, no error message — which makes contention easy to miss until it becomes severe. Dashboards that are fast in the morning and slow in the afternoon, or dbt runs that take 20 minutes on weekends and 90 minutes on weekdays, are typical symptoms.
Two Ways to Get Slots
BigQuery offers two fundamental pricing models that determine how you access slots:
On-demand pricing gives you access to a shared pool of approximately 2,000 slots. You pay per terabyte of data scanned, and BigQuery dynamically allocates slots from this shared pool. It’s simple, requires no planning, and works well for unpredictable workloads. The downside: those 2,000 slots are shared with other on-demand users in the same region, so performance can vary based on regional demand.
Capacity-based pricing (via BigQuery Editions) lets you reserve your own dedicated slots. You pay for the slots themselves, regardless of how much data you scan. This gives you predictable performance and — for heavy workloads — often costs less than on-demand. The trade-off is that you’re committing to capacity whether you use it or not (though autoscaling mitigates this).
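As a rough illustration of the trade-off, the two models reduce to simple arithmetic. The prices below are assumptions for the sketch ($6.25 per TiB scanned on-demand, $0.04 per slot-hour for capacity); check current regional pricing before running real numbers.

```python
def on_demand_cost(tib_scanned, price_per_tib=6.25):
    """On-demand: pay per TiB scanned; slots come from the shared pool.
    (Assumed list price -- verify against current regional pricing.)"""
    return tib_scanned * price_per_tib

def capacity_cost(slots, hours, price_per_slot_hour=0.04):
    """Capacity-based: pay for reserved slots, regardless of bytes
    scanned. (Assumed edition price -- verify before relying on it.)"""
    return slots * hours * price_per_slot_hour

# A team scanning 200 TiB/month on-demand:
print(on_demand_cost(200))              # 1250.0
# 100 baseline slots reserved around the clock (~730-hour month):
print(capacity_cost(100, 730))          # 2920.0
# Breakeven scan volume for that reservation:
print(capacity_cost(100, 730) / 6.25)   # 467.2 TiB/month
```

Below the breakeven volume, on-demand is cheaper; above it, the reservation wins — which is exactly the calculation the cost-model discussion formalizes.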
The BigQuery Cost Model covers the pricing math in detail, including breakeven calculations between the two models. The key insight here is architectural, not financial: on-demand gives you variable performance from a shared pool, while capacity-based gives you dedicated resources with predictable behavior.
A common progression: start on on-demand, monitor slot usage as workloads grow, and migrate to capacity-based pricing when contention becomes a problem or when the cost math favors it. Monitoring slot usage tells you when that point arrives.
More Slots Don’t Always Mean Faster Queries
Additional slots do not always improve performance. Simple queries over small data volumes gain little from extra parallelism: a query scanning 1 GB may run in about 2 seconds with either 100 or 500 slots, because coordination overhead cancels out the benefit of the added parallelism.
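One way to see why: an Amdahl-style toy model where the parallel scan gets faster with more slots, but per-slot coordination adds overhead that grows with slot count. The constants here are invented for illustration, not measured from BigQuery:

```python
def query_time(scan_secs_serial, slots, overhead_per_slot=0.004):
    """Toy model: parallel scan time plus coordination overhead
    that grows with the number of slots involved."""
    return scan_secs_serial / slots + overhead_per_slot * slots

# A small scan worth ~200 s of serial work:
print(round(query_time(200, 100), 2))   # 2.4 -> 2 s scan + 0.4 s overhead
print(round(query_time(200, 500), 2))   # 2.4 -> 5x the slots, no faster
```

Past a certain point the overhead term dominates, which is why the bigger wins usually come from scanning less data, not from scheduling more slots.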
Partitioning, clustering, and avoiding SELECT * often deliver larger gains than adding slots. The most expensive query in a project is typically one that scans 10x more data than it needs, not one that is under-slotted.
A common misconception is that 1,000 slots means 1,000 concurrent queries. Slots are shared across all jobs in a reservation via fair scheduling. 1,000 slots might run 1 query at 1,000 slots, or 100 queries at 10 slots each. Actual concurrency depends on query complexity and fair scheduling dynamics.
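A sketch of that division of slots (a simplification: BigQuery's fair scheduler also weighs per-stage demand and can lend idle capacity across reservations, none of which is modeled here):

```python
def fair_share(total_slots, query_demands):
    """Split slots evenly across running queries, redistributing any
    share a query can't use to the queries that still want more."""
    alloc = {q: 0 for q in query_demands}
    remaining = total_slots
    unsatisfied = dict(query_demands)
    while remaining > 0 and unsatisfied:
        share = max(remaining // len(unsatisfied), 1)
        for q in list(unsatisfied):
            give = min(share, unsatisfied[q], remaining)
            alloc[q] += give
            unsatisfied[q] -= give
            remaining -= give
            if unsatisfied[q] == 0:
                del unsatisfied[q]
            if remaining == 0:
                break
    return alloc

# One heavy query alone gets the whole reservation:
print(fair_share(1000, {"q1": 2000}))   # {'q1': 1000}
# A light query takes only what it needs; the heavy one absorbs the rest:
print(fair_share(1000, {"heavy": 2000, "light": 50}))
# -> {'heavy': 950, 'light': 50}
```

The same 1,000 slots behave completely differently depending on the mix: one hungry query monopolizes them, while many small queries carve them into thin, fair shares.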