GA4’s interface applies user resolution logic that BigQuery never sees. This is one source of the numeric gap between GA4 and BigQuery, alongside HyperLogLog estimation and consent modeling.
The three modes
GA4’s interface offers three “Reporting Identity” settings, found in Admin → Property Settings:
| Mode | Resolution order |
|---|---|
| Blended | User ID → Device ID → Modeling |
| Observed | User ID → Device ID |
| Device-based | Device ID only |
Blended is the default for most properties. It attempts to use your user_id when available, falls back to user_pseudo_id when not, and fills remaining gaps with behavioral modeling — estimating what anonymous users likely did based on patterns from identified users. This is the mode that produces the widest gap with BigQuery.
Observed uses deterministic signals only (no modeling), but still applies GA4’s cross-device resolution when a user is signed into Google across devices via Google Signals. If the same Google-signed-in person browses on their phone and laptop, GA4 sees them as one user. BigQuery doesn’t.
Device-based is the mode closest to what BigQuery exports: it uses only user_pseudo_id, with no cross-device logic and no modeling. The gap with BigQuery narrows considerably, though HyperLogLog (HLL) estimation and consent mode still contribute discrepancies.
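To make the fallback order concrete, here is a minimal Python sketch of the table above. The `reporting_key` helper and the mode names as dictionary keys are hypothetical, chosen for illustration; GA4’s real resolution works across sessions and devices rather than one event at a time, and modeling is an aggregate estimate that cannot be computed per event.

```python
from typing import Optional

# Resolution order per mode, copied from the table above.
# "modeling" is listed for completeness only: it is an aggregate estimate,
# so this per-event sketch can never resolve an identifier from it.
RESOLUTION_ORDER = {
    "blended": ["user_id", "device_id", "modeling"],
    "observed": ["user_id", "device_id"],
    "device_based": ["device_id"],
}


def reporting_key(mode: str, user_id: Optional[str],
                  user_pseudo_id: Optional[str]) -> Optional[str]:
    """Return the first identifier available in the mode's order (hypothetical helper)."""
    identifiers = {"user_id": user_id, "device_id": user_pseudo_id}
    for source in RESOLUTION_ORDER[mode]:
        value = identifiers.get(source)
        if value:
            return f"{source}:{value}"
    # No deterministic identifier is present; only modeling could cover this case.
    return None


print(reporting_key("observed", "u1", "dev-a"))      # user_id wins when present
print(reporting_key("observed", None, "dev-a"))      # falls back to the device ID
print(reporting_key("device_based", "u1", "dev-a"))  # user_id is ignored entirely
```

The only difference between Observed and Device-based in this sketch is whether user_id is consulted at all, which is why Device-based lines up most closely with a count of user_pseudo_id values.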
Why none of this reaches BigQuery
The BigQuery export is a raw event-level stream. GA4 exports whatever identifiers were present at collection time: the user_pseudo_id cookie value and the user_id you sent (or null if you didn’t). No resolution logic is applied during export.
This means:
- Behavioral modeling data never appears in BigQuery. The users GA4’s Blended mode estimated don’t show up as rows. They exist only in the interface as modeled additions to your metrics.
- Google Signals cross-device deduplication doesn’t apply. In BigQuery, the same person on two devices appears as two separate user_pseudo_id values. There’s no connection between them unless they authenticate in your app and you capture user_id on both devices.
- GA4’s own identity stitching doesn’t carry over. Even if GA4’s interface has resolved that a given user_pseudo_id belongs to a known user_id, that resolution isn’t reflected in the export rows. Each event carries only the identifiers that were present when the event was collected.
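A small Python sketch of what this looks like in practice. The rows are invented sample data that mimic the export’s user_pseudo_id and user_id columns, and the two counting helpers are hypothetical names: one person browses on a phone and a laptop but signs in only on the laptop.

```python
# Invented sample rows shaped like export events. The same person uses a
# phone and a laptop, and authenticates on the laptop only.
rows = [
    {"event_name": "page_view", "user_pseudo_id": "phone.111", "user_id": None},
    {"event_name": "page_view", "user_pseudo_id": "laptop.222", "user_id": None},
    {"event_name": "login", "user_pseudo_id": "laptop.222", "user_id": "ana"},
]


def distinct_devices(rows):
    """Count distinct user_pseudo_id values, i.e. distinct devices or browsers."""
    return len({r["user_pseudo_id"] for r in rows})


def distinct_known_users(rows):
    """Count distinct non-null user_id values."""
    return len({r["user_id"] for r in rows if r["user_id"] is not None})


print(distinct_devices(rows))      # two devices for one person
print(distinct_known_users(rows))  # one authenticated user
```

Nothing in these rows ties phone.111 to "ana", because user_id was never captured on the phone. Any cross-device link the interface may have inferred is simply not in the data.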
The practical consequence
When you compare user counts between GA4 and BigQuery, you’re comparing two different things. GA4 Blended mode reports “our best estimate of unique people including modeled users.” BigQuery reports “distinct device identifiers from events where analytics tracking was enabled.”
BigQuery is a source of truth for auditable, reproducible analysis, but it carries less than the interface by design: the export contains neither the modeled users nor the cross-device deduplication that the interface applies. Cross-session and cross-device analysis requires building a separate identity stitching layer in the warehouse.
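As a rough illustration of what such a stitching layer does, the Python sketch below backfills a user_id onto anonymous events from the same device. It assumes the export columns event_timestamp, user_pseudo_id, and user_id; the function names, the resolved_id field, and the "most recent user_id wins" rule are assumptions made for this example, not GA4 behavior. In a warehouse the same logic would normally be written in SQL.

```python
def build_device_map(rows):
    """Map each user_pseudo_id to the most recent non-null user_id seen on it."""
    device_to_user = {}
    for row in sorted(rows, key=lambda r: r["event_timestamp"]):
        if row["user_id"] is not None:
            device_to_user[row["user_pseudo_id"]] = row["user_id"]
    return device_to_user


def stitch(rows):
    """Add a resolved_id: the event's user_id, else the device's known user_id,
    else the raw user_pseudo_id."""
    device_to_user = build_device_map(rows)
    return [
        {
            **row,
            "resolved_id": row["user_id"]
            or device_to_user.get(row["user_pseudo_id"], row["user_pseudo_id"]),
        }
        for row in rows
    ]


# Invented sample data: "ana" signs in on a phone and a laptop;
# a third device never authenticates.
rows = [
    {"event_timestamp": 1, "user_pseudo_id": "phone.111", "user_id": None},
    {"event_timestamp": 2, "user_pseudo_id": "phone.111", "user_id": "ana"},
    {"event_timestamp": 3, "user_pseudo_id": "laptop.222", "user_id": "ana"},
    {"event_timestamp": 4, "user_pseudo_id": "tablet.333", "user_id": None},
]

stitched = stitch(rows)
resolved_ids = {r["resolved_id"] for r in stitched}
print(len({r["user_pseudo_id"] for r in rows}))  # distinct devices before stitching
print(len(resolved_ids))                         # distinct resolved identities after
```

Three devices collapse to two identities here: the phone’s anonymous event is attributed to "ana" retroactively, while the tablet stays a device-level identity. Shared devices and multiple logins on one browser need an explicit rule of their own, which this sketch does not attempt.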
What the reporting identity setting tells you
The setting does give you one useful signal: if a property is set to Device-based, either someone deliberately chose not to use user_id, or user_id isn’t implemented at all. The interface and BigQuery will agree more closely, but you’re also not getting any benefit from authenticated identity signals.
If it’s set to Blended and you’re seeing large GA4 vs BigQuery gaps on European properties, consent rejection is likely the dominant cause — GA4 is modeling the non-consenting population, and those estimates never hit BigQuery. See Consent Mode Basic vs Advanced for how the modeling works and GA4 BigQuery Number Discrepancies for how to frame this for stakeholders.
The reporting identity mode is a UI concern. Building reliable user-level analytics in the warehouse requires building resolution logic from the raw identifiers the export provides, not from what GA4 resolved in the interface.