ServicesAboutNotesContact Get in touch →
EN FR
Note

IAM Debt Audit for GCP Data Platforms

Bash and SQL queries to surface Editor roles, service accounts with keys, and shared credentials — the starting point for any GCP IAM cleanup.

Planted
gcpbigquerydata engineering

Data platforms accumulate IAM debt over time: Editor roles granted for convenience, shared service accounts, service account keys committed to repositories. This note provides bash and SQL queries to surface the most common forms of that debt — a baseline before tightening permissions.

Find Over-Privileged Principals

List all principals with Editor or Owner roles at the project level:

Terminal window
gcloud projects get-iam-policy YOUR_PROJECT_ID \
--flatten="bindings[].members" \
--format="table(bindings.role,bindings.members)" \
--filter="bindings.role:(roles/editor OR roles/owner)"

This single command surfaces the highest-risk permissions in your project. Editor and Owner are primitive roles that predate IAM’s predefined roles — they grant broad access across every GCP service, not just BigQuery. Anyone who appears here because they “needed BigQuery access” is a cleanup candidate. The right fix is downgrading to the appropriate predefined role (roles/bigquery.dataViewer, roles/bigquery.dataEditor) at the appropriate scope (dataset, not project).

Don’t expect to fix every result immediately. Prioritize service accounts first — they’re often automated workloads that won’t notice the change. Human users with Editor “because IT set it up that way” take more coordination.

Find Service Accounts with Keys

Service account keys are downloadable credentials that can be copied, committed to version control, and used from anywhere. They’re the highest-risk artifact in a typical GCP setup.

Terminal window
gcloud iam service-accounts list --project=YOUR_PROJECT_ID \
--format="value(email)" | while read sa; do
keys=$(gcloud iam service-accounts keys list --iam-account="$sa" \
--format="value(name)" --filter="keyType=USER_MANAGED" 2>/dev/null)
if [ -n "$keys" ]; then
echo "Service account with keys: $sa"
fi
done

The --filter="keyType=USER_MANAGED" excludes Google-managed keys (system keys that Google rotates automatically). You’re looking for keys that humans created and downloaded — those are the ones that can be leaked.

For each result, the remediation path is:

  1. Determine what’s using the key (check deployment configs, CI/CD systems, local developer machines)
  2. Migrate to keyless authentication: Workload Identity Federation for CI/CD for CI systems, service account impersonation for local development
  3. Rotate or delete the key after confirming nothing depends on it

Detect Shared Service Accounts

The shared service account problem is harder to see from IAM policies alone. A single etl-service-account might appear appropriately scoped, but be used by a dozen different workloads. The evidence shows up in BigQuery job history:

SELECT
user_email,
COUNT(DISTINCT job_id) AS job_count,
COUNT(DISTINCT REGEXP_EXTRACT(query, r'FROM `([^`]+)`')) AS tables_accessed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
AND user_email LIKE '%.iam.gserviceaccount.com'
GROUP BY user_email
ORDER BY job_count DESC

A service account running hundreds of different queries against dozens of tables is almost certainly shared across workloads. High job count plus high table count is the signature pattern. That’s your first remediation target — not because it’s the easiest fix, but because shared service accounts are where blast radius accumulates. When one workload’s permissions need to change, you can’t change them without affecting everything else sharing the account.

Read the Results as a Prioritized Backlog

The three queries together give you a risk-ordered remediation backlog:

FindingRiskEffortFix
Editor/Owner on service accountCriticalLowDowngrade to predefined role
Service account key existsHighMediumMigrate to keyless auth
Editor/Owner on human userMediumHighCoordinate with user, downgrade
Shared service accountMediumHighSplit into [[Per-Workload Service Account Naming Conventions

Start with service accounts. They’re automated systems — you can update the permissions and see immediately whether anything breaks. Human users require coordination and usually some lead time.

The 2-Layer RBAC with Google Groups pattern is the destination state for human user permissions. The Per-Workload Service Account Naming Conventions pattern is the destination state for service accounts. Run the audit first to understand the gap, then work toward those patterns one workload at a time.