BigQuery’s INFORMATION_SCHEMA views expose query costs, user spend, table partitioning status, and cost trends over time. This note covers the key queries for cost attribution and table optimization review.
Finding Your Cost Culprits
Before optimizing anything, identify where money actually goes. This query surfaces your top 10 most expensive queries from the past 30 days:
```sql
SELECT
  user_email,
  query,
  total_bytes_processed,
  ROUND(total_bytes_processed / POW(1024, 4), 2) AS tib_processed,
  ROUND(total_bytes_processed / POW(1024, 4) * 6.25, 2) AS estimated_cost_usd,
  creation_time,
  total_slot_ms
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
  AND total_bytes_processed > 0
ORDER BY total_bytes_processed DESC
LIMIT 10;
```

In most projects, 3-5 queries or users account for 70%+ of compute spend. Optimize these before addressing the long tail.
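The `estimated_cost_usd` arithmetic above is simple enough to reuse anywhere you have a byte count. A minimal sketch in Python, assuming the on-demand rate of $6.25 per TiB scanned (verify the current rate for your region):

```python
def estimated_cost_usd(bytes_processed: int, price_per_tib: float = 6.25) -> float:
    """Convert bytes scanned to an on-demand cost estimate.

    Mirrors the SQL expression: bytes / POW(1024, 4) * 6.25.
    """
    tib = bytes_processed / 1024 ** 4
    return round(tib * price_per_tib, 2)

# A query scanning 2 TiB costs $12.50 at the default rate.
print(estimated_cost_usd(2 * 1024 ** 4))  # 12.5
```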
Cost Attribution by User
For ongoing cost management, aggregate by user or service account:
```sql
SELECT
  user_email,
  COUNT(*) AS query_count,
  ROUND(SUM(total_bytes_processed) / POW(1024, 4), 2) AS total_tib,
  ROUND(SUM(total_bytes_processed) / POW(1024, 4) * 6.25, 2) AS total_cost_usd
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
GROUP BY user_email
ORDER BY total_cost_usd DESC
LIMIT 20;
```

This immediately tells you whether costs are driven by dbt service accounts (pipeline costs, optimizable through materializations and incrementals), BI tool service accounts (dashboard costs, optimizable through partitioning and materialized views), or human users (ad-hoc costs, addressable through governance controls).
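When triaging the output, the first split is usually service accounts versus humans. A small helper for bucketing `user_email` values (a sketch; the `dbt`/`looker` name substrings are assumptions about a typical setup, not a standard, so adjust them to your own account naming):

```python
def classify_principal(user_email: str) -> str:
    """Roughly bucket a JOBS_BY_PROJECT user_email for cost triage."""
    if user_email.endswith("gserviceaccount.com"):
        # Hypothetical naming conventions: tune these substrings to your accounts.
        if "dbt" in user_email:
            return "pipeline (dbt)"
        if "looker" in user_email or "bi" in user_email:
            return "dashboards (BI)"
        return "service account (other)"
    return "human (ad hoc)"

print(classify_principal("dbt-prod@my-project.iam.gserviceaccount.com"))  # pipeline (dbt)
```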
Cost Attribution by dbt Model
If your dbt project is configured with query comments and job labels, you can attribute costs to individual models:
```sql
SELECT
  JSON_EXTRACT_SCALAR(
    REGEXP_EXTRACT(query, r'/\*\s*(\{.*\})\s*\*/'),
    '$.dbt_model'
  ) AS dbt_model,
  COUNT(*) AS runs,
  ROUND(SUM(total_bytes_processed) / POW(1024, 4), 4) AS total_tib,
  ROUND(SUM(total_bytes_processed) / POW(1024, 4) * 6.25, 2) AS total_cost_usd
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND query LIKE '%dbt_model%'
GROUP BY 1
ORDER BY total_cost_usd DESC;
```

The REGEXP_EXTRACT pulls the JSON query comment out of the SQL text before JSON_EXTRACT_SCALAR reads the `dbt_model` key. This query reveals which models consume the most resources.
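The same extraction works client-side when post-processing job metadata in Python. This sketch assumes your dbt query-comment macro emits a JSON object containing a `dbt_model` key (that key is this note's convention, not dbt's default comment shape):

```python
import json
import re


def extract_dbt_model(query_text: str):
    """Pull the dbt_model value out of a JSON query comment like
    /* {"dbt_model": "fct_orders"} */ prepended via dbt's query-comment config."""
    match = re.search(r"/\*\s*(\{.*?\})\s*\*/", query_text, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(1)).get("dbt_model")
    except json.JSONDecodeError:
        return None


sample = '/* {"app": "dbt", "dbt_model": "fct_orders"} */\nSELECT * FROM fct_orders'
print(extract_dbt_model(sample))  # fct_orders
```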
For more comprehensive dbt cost monitoring, the dbt-bigquery-monitoring community package automates this pattern across projects and regions.
Cost Trend by Dataset
Track how costs evolve over time to catch regressions early:
```sql
SELECT
  DATE(creation_time) AS date,
  referenced_table.dataset_id,
  ROUND(SUM(total_bytes_processed) / POW(1024, 4), 2) AS tib_scanned,
  ROUND(SUM(total_bytes_processed) / POW(1024, 4) * 6.25, 2) AS cost_usd
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT,
  UNNEST(referenced_tables) AS referenced_table
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
GROUP BY 1, 2
ORDER BY date DESC, cost_usd DESC;
```

A sudden spike in one dataset’s cost often means someone deployed a new model without partitioning, changed a materialization from incremental to full-refresh, or added an expensive new query. Catching this within a day is the difference between a $50 mistake and a $1,500 month.
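The "catch it within a day" part is automatable once the daily figures land somewhere. A minimal spike check over a dataset's daily cost series (a sketch; the 2x-trailing-average threshold is an assumption to tune for your workload's natural variance):

```python
def is_cost_spike(daily_costs, multiplier=2.0):
    """Flag the most recent day if it exceeds multiplier x the trailing average.

    daily_costs: list of daily USD figures in chronological order.
    """
    if len(daily_costs) < 2:
        return False
    *history, today = daily_costs
    baseline = sum(history) / len(history)
    return baseline > 0 and today > multiplier * baseline


print(is_cost_spike([40, 45, 42, 150]))  # True: roughly 3.5x the trailing average
```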
Finding Unoptimized Tables
Large tables without partitioning or clustering are optimization candidates hiding in plain sight:
```sql
SELECT
  s.table_schema,
  s.table_name,
  ROUND(s.total_logical_bytes / POW(1024, 3), 2) AS size_gb,
  CASE WHEN p.partition_column IS NULL THEN 'Missing' ELSE 'OK' END AS partitioning,
  CASE WHEN c.clustering_columns IS NULL THEN 'Missing' ELSE 'OK' END AS clustering
FROM `project.region-us`.INFORMATION_SCHEMA.TABLE_STORAGE AS s
LEFT JOIN (
  SELECT table_schema, table_name, ANY_VALUE(column_name) AS partition_column
  FROM `project.region-us`.INFORMATION_SCHEMA.COLUMNS
  WHERE is_partitioning_column = 'YES'
  GROUP BY 1, 2
) AS p
  ON p.table_schema = s.table_schema AND p.table_name = s.table_name
LEFT JOIN (
  SELECT
    table_schema,
    table_name,
    STRING_AGG(column_name ORDER BY clustering_ordinal_position) AS clustering_columns
  FROM `project.region-us`.INFORMATION_SCHEMA.COLUMNS
  WHERE clustering_ordinal_position IS NOT NULL
  GROUP BY 1, 2
) AS c
  ON c.table_schema = s.table_schema AND c.table_name = s.table_name
WHERE s.total_logical_bytes > 107374182400  -- > 100 GB
ORDER BY s.total_logical_bytes DESC;
```

Any table over 100 GB showing “Missing” for partitioning is likely costing you significant money on every query. Cross-reference with the top expensive queries — if an expensive query hits an unpartitioned table, you’ve found a quick win.
Using Google’s Built-In Recommendations
BigQuery’s recommender automatically suggests partitioning and clustering for tables that would benefit. It analyzes query patterns and suggests columns based on actual filter usage, not guesses. Access recommendations through:
- Cloud Console: BigQuery > Table details > Recommendations tab
- `INFORMATION_SCHEMA.RECOMMENDATIONS` for programmatic access
- Recommender API for automated processing
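For the API route, here is a sketch using the `google-cloud-recommender` client library. The recommender ID shown (`google.bigquery.table.PartitionClusterRecommender`) and the region value are assumptions to verify against Google's current documentation before relying on them:

```python
def recommender_parent(project_id: str, region: str = "us") -> str:
    """Build the Recommender API parent path for BigQuery partition/cluster
    recommendations. The recommender ID below is an assumption to verify."""
    return (
        f"projects/{project_id}/locations/{region}"
        "/recommenders/google.bigquery.table.PartitionClusterRecommender"
    )


def list_partition_recommendations(project_id: str, region: str = "us"):
    """Fetch recommendations. Requires google-cloud-recommender, credentials,
    and IAM access; not runnable offline."""
    from google.cloud import recommender_v1  # pip install google-cloud-recommender

    client = recommender_v1.RecommenderClient()
    recs = client.list_recommendations(parent=recommender_parent(project_id, region))
    return [(rec.name, rec.description) for rec in recs]
```

Feeding the returned descriptions into the weekly review (below) closes the loop between detection and action.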
The recommender only flags tables above 100 GB for partitioning, which aligns with Google’s own threshold for when partitioning overhead is justified.
Building a Weekly Review Practice
The queries above are most valuable when run regularly. A weekly cost review takes 15 minutes and catches problems before they compound into surprise bills.
Weekly checklist:
- Top 10 expensive queries: Any new additions? Any that could be optimized with better filters?
- Cost by user/service account: Any unexpected spikes? New service accounts appearing?
- Cost trend by dataset: Any datasets trending upward? Any regressions from recent deployments?
- Unoptimized tables: Any new large tables created without partitioning?
Key metrics for a cost dashboard:
- Daily compute spend: Catch anomalies before they compound
- Bytes scanned trend: Leading indicator of cost changes
- Slot utilization (if using reservations): Identify under/over provisioning
- Average cost per query: Track optimization progress over time
- Top cost contributors: Users, service accounts, datasets, models
Alert thresholds:
- 50% of daily budget: Warning notification
- 80% of daily budget: Critical notification to team lead
- 100%: Auto-triggered investigation process
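The thresholds above translate directly into a notification routine. A sketch (the tier names and cutoffs mirror the list; wire the return value to your alerting tool of choice):

```python
def budget_alert_level(spend_usd: float, daily_budget_usd: float) -> str:
    """Map today's spend against the daily budget to an alert tier."""
    ratio = spend_usd / daily_budget_usd
    if ratio >= 1.0:
        return "investigate"  # budget exhausted: auto-trigger investigation
    if ratio >= 0.8:
        return "critical"     # notify team lead
    if ratio >= 0.5:
        return "warning"      # heads-up notification
    return "ok"


print(budget_alert_level(85, 100))  # critical
```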
Weekly review catches cost regressions close to deployment, when the cause is easier to identify and fix.