BigQuery Fine-Grained Access Control

BigQuery IAM roles control whether someone can access a table at all. Fine-grained access control goes deeper — controlling which columns they can see, which rows are returned, and what sensitive values look like when they do have access. These three mechanisms — column-level security, row-level security, and dynamic data masking — address the real-world requirement that most teams have: different users need different views of the same underlying data.

The 2-Layer RBAC Foundation

Before reaching for column or row-level controls, establish a clear group-based role structure. Predefined IAM roles (Data Viewer, Data Editor, Data Owner) form the object access layer. Google Groups representing functional roles (commonly something like loader-role, engineer-role, analyst-role) compose these IAM roles for specific job functions.

The critical operational habit: add users to groups rather than managing individual role bindings. A Google Group called data-analysts@company.com gets bigquery.dataViewer on the analytics datasets. When an analyst joins, you add them to the group. When they leave, you remove them. You never touch IAM directly.

This structure gives you an auditable access control system without the maintenance overhead of per-user bindings. It also makes incident response faster: “remove all analyst access immediately” is a single IAM binding change, not hunting through hundreds of individual permissions.

Column-Level Security with Policy Tags

Column-level security uses Data Catalog policy tags to restrict which users can see specific columns. The setup involves three components: a tag taxonomy, column annotations in BigQuery schemas, and IAM grants on the taxonomy.

Setting Up a Policy Tag Taxonomy

Organize tags hierarchically from broad to specific:

PII
├── High_Sensitivity
│   ├── SSN
│   ├── Credit_Card_Number
│   └── Bank_Account
└── Medium_Sensitivity
    ├── Email_Address
    ├── Phone_Number
    └── Date_of_Birth

# Create the taxonomy
gcloud data-catalog taxonomies create \
    --location=us \
    --display-name="Data Sensitivity" \
    --description="PII classification hierarchy for BigQuery column security"

# Create policy tags within the taxonomy (done through Data Catalog UI or API)

Enable access control enforcement on the taxonomy — this is what makes the tags actually enforce restrictions rather than just label columns:

gcloud data-catalog taxonomies set-iam-policy TAXONOMY_ID \
    --location=us \
    --member="group:data-analysts@company.com" \
    --role="roles/datacatalog.categoryViewer"

Tagging Columns in BigQuery

Apply policy tags directly in CREATE TABLE statements or ALTER TABLE:

-- Tag sensitive columns at table creation
CREATE TABLE `project.dataset.customers` (
    customer_id STRING,
    email STRING OPTIONS (policy_tags='["projects/my-project/locations/us/taxonomies/123/policyTags/456"]'),
    phone STRING OPTIONS (policy_tags='["projects/my-project/locations/us/taxonomies/123/policyTags/456"]'),
    date_of_birth DATE OPTIONS (policy_tags='["projects/my-project/locations/us/taxonomies/123/policyTags/789"]'),
    country STRING,
    signup_date DATE
);

Once a policy tag is applied and access control is enabled on the taxonomy, users without roles/datacatalog.categoryFineGrainedReader on that specific tag receive an error when their query touches the column:

Access Denied: BigQuery BigQuery: User does not have permission to access policy tag
"projects/my-project/locations/us/taxonomies/123/policyTags/456" on column
project.dataset.customers.email

Tagging at the right level matters: tag at the highest logical level appropriate (the PII category, not individual SSN or Email_Address sub-tags) unless different sub-categories genuinely have different access populations. Managing one policy tag grant covers many columns; managing ten individual grants creates overhead without additional security.

Row-Level Security with Row Access Policies

Row Access Policies filter query results based on user identity. The same SELECT query returns different rows depending on who runs it — without any application-layer logic, without separate tables per user segment.

-- Create a row access policy: analysts see only their assigned region's data
CREATE ROW ACCESS POLICY analyst_region_filter
ON `project.dataset.sales`
GRANT TO ("group:data-analysts@company.com")
FILTER USING (region = SESSION_USER_ATTRIBUTE('region'));

-- Or simpler: filter based on session user email domain
CREATE ROW ACCESS POLICY region_east_policy
ON `project.dataset.sales`
GRANT TO ("group:east-analysts@company.com")
FILTER USING (region = 'east');

-- Admins see everything (no policy restricts them)
-- Multiple policies can apply to the same table -- user sees the union of all matching rows

The SESSION_USER() function enables dynamic filtering without hardcoding user identities into policies:

-- Dynamic row filtering: each analyst sees only records where owner_email matches their identity
CREATE ROW ACCESS POLICY owner_only_access
ON `project.dataset.campaign_data`
GRANT TO ("group:campaign-managers@company.com")
FILTER USING (owner_email = SESSION_USER());

Row Access Policies apply transparently. A user running SELECT * FROM campaign_data sees only the rows that pass their policy’s filter expression. They can’t tell whether rows were filtered (they just see fewer rows) unless they know the policy exists.

Multiple policies compose as a union: if a user matches two policies, they see the union of both result sets. This means you can layer policies without worrying about one policy canceling another.

One important caveat: Row Access Policies don’t prevent a determined user from inferring the existence of filtered rows through aggregations. SELECT COUNT(*) FROM campaign_data will return different numbers for different users — which may itself reveal information. Consider whether that inference risk matters for your use case.

Dynamic Data Masking

Dynamic data masking extends column-level security with a more nuanced response: instead of denying access to sensitive columns entirely, it shows obscured data to users with limited permissions. Analysts can see that an email column exists and has values, without seeing the actual email addresses.

Three masking approaches:

Default masking — returns the column’s default value (0 for numbers, "" for strings, null for nullable columns):

-- Users with maskedReader on the policy tag see "" instead of the actual email
-- Users with fineGrainedReader see "user@example.com"

Hash masking — returns a SHA256 hash of the value. Useful when you want analysts to track individuals across tables (matching on a consistent hash) without exposing the underlying PII:

-- Hash masking preserves join-ability without exposing values
-- The same email always produces the same hash -- analysts can join tables on it

Nullification — returns NULL regardless of the actual value. Strictest option for fields that should never be visible to certain users even in obfuscated form.

Configure masking rules in Data Catalog after setting up the policy tag structure:

# Grant maskedReader instead of fineGrainedReader to the analyst group
gcloud data-catalog taxonomies policy-tags set-iam-policy POLICY_TAG_ID \
    --location=us \
    --member="group:data-analysts@company.com" \
    --role="roles/datacatalog.categoryMaskedReader"

The practical value: analytics teams can work with data structure and relationships without accessing sensitive values. A data scientist building a churn model can see that customers have email addresses, phone numbers, and billing addresses (and use a consistent hash to track individuals across tables) without the model training data containing actual PII. This enables analytics work that strict column-level denial would block.

Combining the Three Layers

These mechanisms complement each other. A realistic access control configuration for a customer analytics dataset:

User Group	Column Access	Row Access	Data Masking
Data Engineers	All columns	All rows	None (full access)
Data Analysts	Blocked on SSN, CC#	Their region only	Hashed email/phone
BI Tool Service Account	Blocked on all PII	All rows	None (aggregated anyway)
Compliance Team	All columns	All rows	None (full access)

The IAM role structure ([[BigQuery IAM Patterns|jobUser + dataViewer]]) determines whether they can query the table at all. Column-level security with policy tags determines which columns are visible. Row Access Policies determine which rows appear. Dynamic masking determines what sensitive values look like.

None of these require changes to your dbt models or application SQL. They operate at the BigQuery layer — your queries are the same, the access control applies transparently.

The Anti-Patterns That Create Security Debt

Applying policy tags at project level rather than policy level — this affects all tables in a project, making it impossible to grant different column access to different datasets within the same project.

Using the default Compute Engine service account for production workloads — this account’s permissions can’t be reduced below project-level defaults. Any compromise of a workload using this account exposes more than intended. Always create dedicated service accounts for production jobs.

Granting bigquery.dataViewer without bigquery.jobUser — users can see tables in the schema browser but can’t actually run queries. This creates confusion and support tickets without providing any security value. The pair always go together.

Managing individual IAM bindings instead of groups — when someone leaves the company, you’re hunting through IAM bindings across multiple projects to remove their access. Group-based management means removing them from the group removes their access everywhere the group is used.

The main setup investment is designing the policy tag taxonomy and establishing the group structure. Ongoing maintenance is minimal once those foundations are in place.