The `event_params` array in GA4's BigQuery export stores each parameter as a key-value pair where the value is a RECORD with four possible fields: `string_value`, `int_value`, `float_value`, and `double_value`. GA4 auto-detects which field to populate based on the data type of the value sent.
This design keeps the schema flexible — you can add new parameters without altering table structure — but it introduces a class of subtle bugs that are easy to miss because queries don’t error out. They just return nulls.
## How Type Assignment Works
GA4 assigns values to type fields as follows:
| Value Field | Type | Used For |
|---|---|---|
| `string_value` | STRING | Text values, URLs, categories, identifiers |
| `int_value` | INTEGER | Counts, IDs, Unix timestamps, boolean flags (0/1) |
| `float_value` | FLOAT | Currently unused by GA4 |
| `double_value` | FLOAT | Decimal values, prices |
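The routing above can be sketched outside SQL. This is a simplified Python model of which field a sent value would land in (an illustration only — the real type detection happens inside GA4 at collection time):

```python
def to_ga4_value(v):
    """Route a raw value to the event_params value field GA4 would use (simplified sketch)."""
    if isinstance(v, bool):
        return {"int_value": int(v)}   # boolean flags arrive as 0/1 in int_value
    if isinstance(v, int):
        return {"int_value": v}
    if isinstance(v, float):
        return {"double_value": v}     # float_value goes unused; decimals land in double_value
    return {"string_value": str(v)}
```

Note that `to_ga4_value("123")` and `to_ga4_value(123)` land in different fields, which is exactly what makes custom parameters fragile.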
The auto-detection is consistent for built-in GA4 parameters. You can rely on:
- `ga_session_id` → always `int_value` (Unix timestamp)
- `ga_session_number` → always `int_value`
- `page_location` → always `string_value`
- `page_title` → always `string_value`
- `page_referrer` → always `string_value`
- `engaged_session_event` → always `int_value` (0 or 1)
Extracting these with the wrong type field returns null silently:
```sql
-- WRONG: Returns NULL for session ID
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')

-- CORRECT: Returns the integer session timestamp
(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')
```

## The Custom Parameter Problem
Custom parameters — dimensions you instrument yourself — can land in different type fields depending on how the values were collected. The problem compounds with implementation inconsistencies over time:
- A parameter sent as the string `"123"` goes to `string_value`
- The same parameter sent as the integer `123` goes to `int_value`
- If both happen across different events or over time, you have values split across two type fields
This happens in practice. A developer changes a GA4 tag implementation to send a previously-string parameter as a number. Historical events have the value in string_value. New events have it in int_value. A query that only reads from one field gets half the data.
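A toy illustration of that failure mode, using hypothetical rows whose field names mirror the export schema:

```python
# Hypothetical history: the same custom param sent as a string in old events,
# then as an integer after a tag implementation change.
rows = [
    {"key": "plan_tier", "value": {"string_value": "3", "int_value": None}},  # old events
    {"key": "plan_tier", "value": {"string_value": None, "int_value": 3}},    # new events
]

# A query that reads only string_value silently loses the newer half of the data.
only_string = [r["value"]["string_value"] for r in rows if r["key"] == "plan_tier"]
# One real value, one None -- and nothing errors out.
```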
## The COALESCE Defensive Pattern
When you’re uncertain about a parameter’s type — or when type assignment has been inconsistent — use COALESCE across all relevant type fields:
```sql
SELECT
  COALESCE(
    value.string_value,
    CAST(value.int_value AS STRING),
    CAST(value.double_value AS STRING)
  ) AS param_value
FROM UNNEST(event_params)
WHERE key = 'custom_param'
```

This returns the first non-null value across the three fields, casting numeric types to string for a consistent output type. It's defensive but not free — you're evaluating three expressions instead of one. For high-volume queries over large date ranges, this overhead accumulates.
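The same first-non-null-with-cast logic, mirrored in Python for intuition (a sketch; the field order matches the SQL COALESCE):

```python
def coalesce_param(value_record):
    """Return the first non-null type field as a string, mirroring the SQL COALESCE."""
    for field in ("string_value", "int_value", "double_value"):
        v = value_record.get(field)
        if v is not None:
            return str(v)  # numeric types cast to string for a consistent output type
    return None  # all fields null: same as the SQL returning NULL
```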
A more targeted approach, once you’ve audited which types are actually in play:
```sql
-- If you know the param is either string or int, never double:
COALESCE(
  value.string_value,
  CAST(value.int_value AS STRING)
) AS param_value
```

## Auditing Type Distribution
Before building production models that extract a custom parameter, audit how values are actually distributed across type fields:
```sql
SELECT
  COUNTIF(value.string_value IS NOT NULL) AS string_count,
  COUNTIF(value.int_value IS NOT NULL) AS int_count,
  COUNTIF(value.double_value IS NOT NULL) AS double_count,
  COUNTIF(value.float_value IS NOT NULL) AS float_count,
  COUNT(*) AS total
FROM `project.analytics_123456789.events_*`,
  UNNEST(event_params)
WHERE key = 'your_custom_param'
  AND _TABLE_SUFFIX BETWEEN '20260101' AND '20260131'
```

If `string_count` is 95,000 and `int_count` is 5,000, you know you need COALESCE. If only `int_count` is non-zero, you can extract directly from `int_value` with confidence.
Run this audit across a representative date range — not just recent data — especially if your GA4 implementation has been active for years and may have evolved.
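The audit's COUNTIF logic can also be mirrored in Python over in-memory rows, which is handy for unit-testing extraction code against fixture data (a sketch; row shapes are hypothetical but follow the export schema):

```python
def audit_types(events, key):
    """Count non-null type fields for one parameter key, mirroring the COUNTIF audit."""
    fields = ("string_value", "int_value", "double_value", "float_value")
    counts = {f: 0 for f in fields}
    counts["total"] = 0
    for event_params in events:          # one event_params array per event row
        for p in event_params:
            if p["key"] != key:
                continue
            counts["total"] += 1
            for f in fields:
                if p["value"].get(f) is not None:
                    counts[f] += 1
    return counts
```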
## Encoding in dbt Macros
When you’re building reusable extraction macros, encode the type assumption explicitly so future maintainers understand the intent:
```sql
-- macros/ga4_param_string.sql
{% macro ga4_param_string(column, param_name) %}
(SELECT value.string_value FROM UNNEST({{ column }}) WHERE key = '{{ param_name }}')
{% endmacro %}
```
```sql
-- macros/ga4_param_int.sql
{% macro ga4_param_int(column, param_name) %}
(SELECT value.int_value FROM UNNEST({{ column }}) WHERE key = '{{ param_name }}')
{% endmacro %}
```
```sql
-- macros/ga4_param_coerce.sql
-- Use when type is uncertain or historically inconsistent
{% macro ga4_param_coerce(column, param_name) %}
COALESCE(
  (SELECT value.string_value FROM UNNEST({{ column }}) WHERE key = '{{ param_name }}'),
  CAST((SELECT value.int_value FROM UNNEST({{ column }}) WHERE key = '{{ param_name }}') AS STRING),
  CAST((SELECT value.double_value FROM UNNEST({{ column }}) WHERE key = '{{ param_name }}') AS STRING)
)
{% endmacro %}
```

The macro name signals the assumption. `ga4_param_string` tells you string is expected. `ga4_param_coerce` tells you type is uncertain. Documentation in the macro name is better than a comment that might drift from reality.
## Relationship to user_properties
The same type detection logic applies to the `user_properties` array. User properties, set via `setUserProperty` in the Firebase SDKs (gtag.js uses the `user_properties` configuration), are stored in an identical nested structure: each entry has a key and a value RECORD with `string_value`, `int_value`, `float_value`, and `double_value`. The same auditing and defensive extraction patterns apply.
The one difference with `user_properties` is the additional `set_timestamp_micros` field, which records when the property was last updated — useful for tracking property drift over time.
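Since `set_timestamp_micros` is microseconds since the Unix epoch, downstream code often needs to convert it to a proper timestamp. A small Python helper sketching that conversion (the record shape follows the export schema):

```python
from datetime import datetime, timezone

def property_set_time(value_record):
    """Convert a user property's set_timestamp_micros to an aware UTC datetime."""
    micros = value_record.get("set_timestamp_micros")
    if micros is None:
        return None
    return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)
```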