ServicesAboutNotesContact Get in touch →
EN FR
Note

Secured Table Materialization in dbt

A custom dbt materialization that automatically reapplies BigQuery row access policies, column descriptions, and data masking tags after every table rebuild.

Planted
dbtbigquerydata engineeringdata quality

BigQuery supports row access policies that restrict which rows different user groups can see. When dbt drops and recreates a table — as the standard table materialization does — every row access policy on that table disappears. You need to reapply them after every build.

dbt’s default behavior (drop + recreate) conflicts with BigQuery’s security model: policies are attached to specific table objects and are lost on every rebuild. Every time dbt replaces the table, the security configuration resets to zero.

Why Post-Hooks Don’t Scale

For a single model with one policy, post-hooks work fine:

{{ config(
materialized='table',
post_hook="CREATE OR REPLACE ROW ACCESS POLICY region_filter ON {{ this }} GRANT TO (\"group:eu-analysts@company.com\") FILTER USING (region = 'EU')"
) }}

This stops scaling at around three policies per model. The config block becomes unreadable, the quoting gets fragile (nested double quotes inside SQL strings inside Jinja), and applying the same pattern across multiple models means copying walls of post-hook text. If a security requirement changes — say you add a new analyst group — you’re hunting through thirty models to update post-hooks.

A custom materialization handles this declaratively: policies are defined in config and applied automatically after each table build. Adding a policy is one dict entry in the model config rather than a manually escaped SQL string.

The Pattern

Create macros/materializations/secured_table.sql:

{%- materialization secured_table, adapter='bigquery' -%}
{# Setup relations #}
{% set target_relation = this.incorporate(type='table') %}
{% set existing_relation = load_cached_relation(this) %}
{{ run_hooks(pre_hooks) }}
{# Drop and rebuild #}
{% if existing_relation is not none %}
{% do adapter.drop_relation(existing_relation) %}
{% endif %}
{% call statement('main') %}
{{ create_table_as(false, target_relation, sql) }}
{% endcall %}
{# Apply column descriptions from config #}
{% set column_descriptions = config.get('column_descriptions', {}) %}
{% for col_name, description in column_descriptions.items() %}
{% call statement('describe_' ~ col_name) %}
ALTER TABLE {{ target_relation }}
ALTER COLUMN {{ col_name }}
SET OPTIONS(description='{{ description | replace("'", "\\'") }}')
{% endcall %}
{% endfor %}
{# Apply row access policies from config #}
{% set row_access_policies = config.get('row_access_policies', []) %}
{% for policy in row_access_policies %}
{% call statement('policy_' ~ policy.name) %}
CREATE OR REPLACE ROW ACCESS POLICY {{ policy.name }}
ON {{ target_relation }}
GRANT TO ({{ policy.grantees | join(', ') }})
FILTER USING ({{ policy.filter_expression }})
{% endcall %}
{% endfor %}
{{ run_hooks(post_hooks) }}
{% set grant_config = config.get('grants') %}
{% do apply_grants(target_relation, grant_config) %}
{% do persist_docs(target_relation, model) %}
{% do adapter.commit() %}
{{ return({'relations': [target_relation]}) }}
{%- endmaterialization -%}

Using It in a Model

Security policies are defined in the config block as data:

{{ config(
materialized='secured_table',
column_descriptions={
'user_id': 'Unique identifier from the auth system',
'email': 'User email address, restricted to PII-authorized groups',
'region': 'Geographic region code used for access filtering',
'lifetime_value': 'Total revenue attributed to this user in EUR'
},
row_access_policies=[
{
'name': 'eu_analysts_filter',
'grantees': ['"group:eu-analysts@company.com"'],
'filter_expression': "region = 'EU'"
},
{
'name': 'us_analysts_filter',
'grantees': ['"group:us-analysts@company.com"'],
'filter_expression': "region = 'US'"
},
{
'name': 'admin_full_access',
'grantees': ['"group:data-admins@company.com"'],
'filter_expression': 'TRUE'
}
]
) }}
SELECT
user_id,
email,
region,
lifetime_value
FROM {{ ref('int__users_enriched') }}

Every time this model runs, it rebuilds with the column descriptions and row access policies applied automatically.

Column Descriptions: Two Valid Approaches

Column descriptions in this materialization have an alternative path: dbt’s persist_docs with columns: true reads descriptions from schema YAML files and applies them via the adapter. This works well if your team already maintains schema YAML.

The materialization approach keeps descriptions in the model config alongside the SQL, which some teams prefer — everything about the table’s definition lives in one file. Both approaches are valid. The materialization approach shines when:

  • Column descriptions are tightly coupled to security concerns (documenting what’s restricted and why)
  • You want the security config and documentation in the same place for auditability
  • Your team doesn’t maintain comprehensive schema YAML files

If you already have thorough schema.yml files with column descriptions, don’t duplicate them in the materialization config. Use persist_docs for descriptions and the materialization only for security policies.

Extending with Data Masking

BigQuery’s column-level security uses policy tags from Data Catalog. The pattern is similar to row access policies, but the DDL assigns policy tags to columns rather than creating standalone policies:

{# Add this block after row access policies in the materialization #}
{% set column_policy_tags = config.get('column_policy_tags', {}) %}
{% for col_name, tag_path in column_policy_tags.items() %}
{% call statement('tag_' ~ col_name) %}
ALTER TABLE {{ target_relation }}
ALTER COLUMN {{ col_name }}
SET OPTIONS(policy_tags=['{{ tag_path }}'])
{% endcall %}
{% endfor %}

Then in model config:

{{ config(
materialized='secured_table',
column_policy_tags={
'email': 'projects/my-project/locations/us/taxonomies/123/policyTags/456',
'phone': 'projects/my-project/locations/us/taxonomies/123/policyTags/789'
}
) }}

The taxonomy and policy tags need to exist in Data Catalog before the materialization references them — create those as part of your infrastructure setup (Terraform, gcloud CLI, etc.). The materialization handles only the column-level assignment on each rebuild.

A single materialization can cover row access policies, column descriptions, and data masking tags. The IAM layer controls dataset and query access; this materialization controls what is visible within a table.

Combining with Zero-Downtime

The secured table pattern shown here uses simple drop-and-recreate. For production tables that need both security and availability guarantees, you can combine this with the zero-downtime swap pattern. Build to a temp name, validate, swap, then apply security policies to the final table. The key difference: row access policies must be applied after the swap, to the table that now occupies the target name, not to the temp table that got renamed.

This combined pattern is more complex to maintain, so only use it when you genuinely need both capabilities. Most secured tables are analytical models where a brief rebuild gap is acceptable.