
Consent Mode Impact on Identity Resolution

How GA4 Consent Mode V2 changes what identity data reaches BigQuery — cookieless pings without identifiers, the same-page backstitch nuance, and filtering consented data for stitching pipelines.

Tags: ga4, bigquery, analytics, data quality

Consent Mode V2 changes the shape of data in the BigQuery export in ways that directly affect identity stitching. How it changes depends on whether the site runs Basic or Advanced mode. In Advanced mode, events from users who denied consent still arrive in BigQuery, but without identifiers, so they cannot be stitched.

Basic vs Advanced mode in BigQuery

Basic Consent Mode is straightforward: when analytics_storage is denied, no data is collected and nothing reaches BigQuery. The user is completely invisible. Your stitching pipeline only sees consented events, and they look exactly like pre-consent-mode data.

Advanced Consent Mode is where it gets complicated. When analytics_storage is denied under Advanced mode, the Google tag still sends “cookieless pings” to GA4, and they show up in the BigQuery export as events that carry no identifying information:

-- What cookieless pings look like in BigQuery
SELECT
  event_name,
  user_pseudo_id,                  -- NULL
  user_id,                         -- NULL
  privacy_info.analytics_storage   -- 'No'
FROM `project.analytics_XXXXX.events_*`
WHERE privacy_info.analytics_storage = 'No'
LIMIT 5

These events have:

  • NULL for user_pseudo_id
  • No ga_session_id in event_params
  • privacy_info.analytics_storage = 'No'

They cannot be stitched or sessionized. There’s no identifier to join on, no session to assign them to. They’re useful for Google’s behavioral modeling in the GA4 interface, but for your warehouse pipeline they’re noise that must be filtered before reaching your stitching layer.
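To confirm this on your own property, a query along these lines counts how many events in each consent state are missing each identifier. It is a sketch that reuses the placeholder table name from above; the 7-day window and the column aliases are arbitrary choices, not part of the export.

```sql
-- Identifier coverage by consent state (sketch; the 7-day window is arbitrary)
SELECT
  privacy_info.analytics_storage AS consent_status,
  COUNT(*) AS events,
  -- Events with no device identifier at all
  COUNTIF(user_pseudo_id IS NULL) AS events_without_pseudo_id,
  -- Events whose event_params contain no ga_session_id entry
  COUNTIF(
    (SELECT ep.value.int_value
     FROM UNNEST(event_params) AS ep
     WHERE ep.key = 'ga_session_id') IS NULL
  ) AS events_without_session_id
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN
      FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY consent_status
ORDER BY events DESC
```

If the description above holds for your export, both "without" counts on the 'No' row should be equal or close to that row's total event count, and close to zero on the 'Yes' row.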

Before assuming all your events are stitchable, check how your data is actually distributed:

SELECT
  privacy_info.analytics_storage AS consent_status,
  COUNT(*) AS events,
  COUNT(DISTINCT user_pseudo_id) AS unique_devices,
  ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) AS pct_of_total
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
GROUP BY privacy_info.analytics_storage
ORDER BY events DESC

Three possible values for analytics_storage:

  • 'Yes' — consented, full identifiers present
  • 'No' — denied, cookieless ping (if Advanced mode)
  • NULL — collected before Consent Mode was implemented; treat as consented

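Take a site on Advanced Consent Mode where 40% of visitors reject analytics cookies, say on French traffic. It will show a large share of events with analytics_storage = 'No'. That share is measured in events rather than users, so it only approximates the rejection rate: it lands near 40% only if consenting and non-consenting visitors generate similar numbers of events. Those events count toward your total event volume but cannot participate in identity stitching.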
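Because rejection rates differ by market, it can help to break the denied share down by country. The query below is a minimal sketch: it assumes geo.country is populated on cookieless pings in your export (worth checking), and the 1,000-event floor is a hypothetical threshold to hide tiny markets.

```sql
-- Share of denied-consent events per country over the last 30 days (sketch)
SELECT
  geo.country AS country,
  COUNT(*) AS events,
  COUNTIF(privacy_info.analytics_storage = 'No') AS denied_events,
  ROUND(
    SAFE_DIVIDE(COUNTIF(privacy_info.analytics_storage = 'No'), COUNT(*)) * 100,
    2
  ) AS pct_denied
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN
      FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY country
HAVING COUNT(*) >= 1000  -- hypothetical floor; tune to your traffic
ORDER BY pct_denied DESC
```

The result is an event share, not a user share: denied events carry no user_pseudo_id, so there is nothing to count distinct users on.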

The filter for clean stitching

For your identity stitching pipeline, filter to consented events at the earliest possible model:

WHERE (
  privacy_info.analytics_storage = 'Yes'
  OR privacy_info.analytics_storage IS NULL -- Pre-consent-mode historical data
)

This filter belongs in your base GA4 events model, not in your stitching logic. Letting non-consented events flow into the identity mapping table and then filtering there is more expensive and more error-prone than removing them at the source.
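As an illustration, the SELECT behind a base events model could look like the sketch below. The column selection, the aliases, and the 30-day window are assumptions for the example, not a prescribed schema; the point is that the consent condition is wrapped in parentheses so it survives being combined with other predicates.

```sql
-- Sketch of a base events SELECT with the consent filter applied up front
SELECT
  PARSE_DATE('%Y%m%d', event_date) AS event_date,
  TIMESTAMP_MICROS(event_timestamp) AS event_ts,
  event_name,
  user_pseudo_id,
  user_id,
  -- Pull the session id out of the repeated event_params field
  (SELECT ep.value.int_value
   FROM UNNEST(event_params) AS ep
   WHERE ep.key = 'ga_session_id') AS ga_session_id,
  privacy_info.analytics_storage AS analytics_storage
FROM `project.analytics_XXXXX.events_*`
WHERE (
    privacy_info.analytics_storage = 'Yes'
    OR privacy_info.analytics_storage IS NULL  -- pre-consent-mode history
  )
  -- Without the parentheses above, this AND would bind only to the IS NULL branch
  AND _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
```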

The practical consequence: your BigQuery user counts will be lower than GA4 interface numbers. GA4’s interface applies behavioral modeling to estimate what denied-consent users did; that modeled data never reaches BigQuery. Don’t try to reconcile the gap — it’s architectural. See GA4 BigQuery Number Discrepancies for how to frame this for stakeholders.

The same-page backstitch nuance

There’s an edge case worth knowing about, even if it rarely causes problems in practice.

If a user denies consent on page load and then grants consent on that same page (clicking “Accept” after initial dismissal), GA4 retroactively applies the Client ID to the previously denied hits from that page. Those events transition from cookieless pings to identified events in the daily tables.

This same-page backstitching does not work across page loads. Once a user navigates away with denied consent, those events stay permanently anonymous. The retroactive correction only applies within a single page view.

In practice, this means your intraday tables may show inconsistencies that resolve in the daily tables. If you’re doing real-time or intraday analysis on stitching, events that appear anonymous in the intraday data may show up as identified once the daily table for that date is exported. For batch pipelines that run against daily tables and reprocess a 3-day lookback window, the reprocessing picks up the correction automatically.
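The read side of such a lookback can be sketched as follows. The 3-day window comes from the paragraph above; everything else (aliases, the idea of comparing runs) is an assumption about how you might monitor it.

```sql
-- Daily denied-event counts over a 3-day lookback window (sketch)
SELECT
  event_date,
  COUNT(*) AS events,
  COUNTIF(privacy_info.analytics_storage = 'No') AS denied_events
FROM `project.analytics_XXXXX.events_*`
-- The upper bound also keeps events_intraday_* tables out of the wildcard:
-- their suffix starts with 'intraday_', which sorts after any date string.
WHERE _TABLE_SUFFIX BETWEEN
      FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY))
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY event_date
ORDER BY event_date
```

Storing the output of each run and comparing denied_events for the same event_date across runs is one way to see whether late corrections actually occur on your property.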

Privacy implications of filtering

Filtering events where analytics_storage = 'No' from your stitching pipeline isn’t just technically correct. Under consent-based privacy regimes such as the EU's GDPR and ePrivacy rules, it is generally a legal requirement as well. A user who denied consent to analytics cookies did not consent to having their browsing behavior linked to their identity. Attempting to stitch non-consented events (e.g., using behavioral patterns to match them probabilistically to known users) is a consent violation.

The consent architecture should make this filter structural, not optional. Non-consented events should branch to a separate anonymous aggregate path before they ever touch the identity layer:

base__ga4__events
├── consented events → int__ga4__events_stitched → identity-linked marts
└── non-consented events → int__ga4__anonymous_aggregates → aggregate-only marts
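The anonymous branch can stay deliberately small. The sketch below shows what the SELECT behind int__ga4__anonymous_aggregates might contain, written against the raw export placeholder for simplicity; the grain (day and event name), the aliases, and the 30-day window are assumptions. No identifier columns are selected, so nothing downstream can join these rows back to a user.

```sql
-- Sketch of an aggregate-only model for non-consented events
SELECT
  PARSE_DATE('%Y%m%d', event_date) AS activity_date,
  event_name,
  COUNT(*) AS events
FROM `project.analytics_XXXXX.events_*`
WHERE privacy_info.analytics_storage = 'No'
  AND _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
GROUP BY activity_date, event_name
ORDER BY activity_date, events DESC
```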

This makes the consent boundary visible in your dbt DAG and prevents future developers from accidentally routing non-consented events into identity resolution.