GA4 stores most of its analytical value in event_params — a repeated RECORD field containing key-value pairs for every event parameter. Extracting these in SQL requires a correlated subquery for each parameter you want. At scale, writing this inline for 20+ parameters per model becomes unmanageable.
The solution is a dbt macro that encapsulates the extraction pattern.
The Core Macro
-- macros/ga4/extract_event_param.sql
{% macro extract_event_param(params_column, param_key, value_type='string_value') %} ( SELECT value.{{ value_type }} FROM UNNEST({{ params_column }}) WHERE key = '{{ param_key }}' LIMIT 1 ){% endmacro %}This generates a correlated subquery inline — the approach that extracts one parameter value without expanding rows. The LIMIT 1 guard handles the rare case where a parameter appears multiple times in event_params (a known GA4 edge case for some tracking implementations).
Usage in a base model:
{{ extract_event_param('event_params', 'page_location', 'string_value') }} AS page__location,{{ extract_event_param('event_params', 'ga_session_id', 'int_value') }} AS session__ga_id,{{ extract_event_param('event_params', 'session_engaged', 'string_value') }} AS session__is_engagedThe value_type parameter matches the four possible fields in GA4’s parameter value structure:
string_value— text parameters (page URLs, event categories, custom string dimensions)int_value— integer parameters (ga_session_id,ga_session_number,entrances)float_value— decimal valuesdouble_value— high-precision decimals (less common, used for some revenue fields)
Why Correlated Subquery, Not CROSS JOIN UNNEST
Two approaches exist for extracting array elements in BigQuery:
Correlated subquery (used in this macro):
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location' LIMIT 1)CROSS JOIN UNNEST:
FROM eventsCROSS JOIN UNNEST(event_params) AS paramWHERE param.key = 'page_location'The CROSS JOIN approach multiplies rows. An event with 15 parameters generates 15 rows in the intermediate result — 14 of which get filtered away. In a base model selecting 20 parameters, this creates enormous intermediate result sets that BigQuery has to sort through.
The correlated subquery keeps exactly one row per event regardless of how many parameters exist. For GA4 data with 10-30 parameters per event, this is a meaningful difference in query efficiency.
BigQuery’s query optimizer can also collapse multiple correlated subqueries into a single pass over event_params — so extracting 20 parameters doesn’t mean 20 separate scans of the array.
The Numeric Variant
GA4 has an inconsistency: some parameters that are conceptually numeric get stored in different value fields depending on context. Revenue might arrive as float_value or double_value. Custom dimensions sometimes switch between int_value and float_value. A robust macro handles all three numeric fields:
{% macro extract_event_param_numeric(params_column, param_key) %} ( SELECT COALESCE( value.int_value, CAST(value.float_value AS INT64), CAST(value.double_value AS INT64) ) FROM UNNEST({{ params_column }}) WHERE key = '{{ param_key }}' LIMIT 1 ){% endmacro %}Use extract_event_param_numeric for parameters where you’re not certain which numeric field GA4 will populate. Use extract_event_param with an explicit value_type when you know exactly what the parameter stores.
The int_value Trap for Session IDs
ga_session_id is stored in value.int_value. A common mistake is extracting it as string_value:
-- WRONG: Returns NULL for ga_session_id{{ extract_event_param('event_params', 'ga_session_id', 'string_value') }}
-- CORRECT: Returns the integer session identifier{{ extract_event_param('event_params', 'ga_session_id', 'int_value') }}Null session IDs cascade into broken session key construction, which breaks every window function partitioned by session key downstream. Always verify which value field a parameter uses against the GA4 schema reference.
Variable-Driven Extraction for Custom Parameters
For projects with many custom event parameters, the Velir approach of variable-driven extraction scales better than hardcoding each parameter. A loop over a configured list:
vars: ga4: custom_parameters: - name: "user_type" value_type: "string_value" - name: "experiment_id" value_type: "string_value" - name: "product_category" value_type: "string_value"{% for param in var('ga4:custom_parameters', []) %} {{ extract_event_param('event_params', param.name, param.value_type) }} AS custom__{{ param.name }},{% endfor %}This makes adding new custom parameters a config change rather than a SQL change — useful for properties that evolve their tracking over time.
Organizing the Macros
Keep GA4-specific macros in their own subfolder:
macros/└── ga4/ ├── extract_event_param.sql ├── extract_event_param_numeric.sql └── default_channel_grouping.sqlThe subfolder keeps macro namespacing clear: {{ extract_event_param() }} is obviously a GA4-domain macro. For cross-project reuse, these macros could live in a private dbt package — especially if you maintain multiple GA4 properties across client projects.