ServicesAboutNotesContact Get in touch →
EN FR
Note

GA4 Event Ordering with Batch Fields

How to use batch_event_index, batch_ordering_id, and batch_page_id for deterministic event sequencing in GA4 BigQuery exports.

Planted
ga4bigqueryanalyticsdata modeling

Understanding the order in which events occurred within a session is essential for sequence analysis: funnel progression, path analysis, time-to-action calculations. The challenge in GA4 BigQuery exports is that event_timestamp alone doesn’t guarantee correct ordering. Multiple events can share the same microsecond timestamp, and the export doesn’t preserve the order they fired on the device.

This matters in practice. A page_view and a scroll event on the same page can share identical timestamps. An add_to_cart and a view_item might fire within the same millisecond during rapid user interaction. If your event sequencing relies only on timestamp, ties resolve arbitrarily, and your sequence analysis produces unreliable results.

The Three Batch Fields

GA4 provides three fields at the root level of the export schema specifically for resolving ordering ambiguity:

FieldDescriptionScope
batch_page_idSequential number assigned to each page, increasing across the user’s engagementCross-page
batch_ordering_idMonotonically increasing number, incremented with each network request from a pageWithin-page
batch_event_indexSequential order of each event within a batch, based on occurrence order on the deviceWithin-batch

These fields form a hierarchy. batch_page_id is the coarsest — it tells you which page the event belongs to. batch_ordering_id sequences the network requests within a page. batch_event_index orders events within a single network request.

Combined, they provide the exact sequence of events as they occurred on the user’s device, resolving timestamp ties deterministically.

The Compound ORDER BY Pattern

Use all three batch fields as tiebreakers after event_timestamp in your window function ORDER BY clauses:

ROW_NUMBER() OVER (
PARTITION BY session_key
ORDER BY event_timestamp, batch_page_id, batch_ordering_id, batch_event_index
) AS event__number_in_session

The ordering logic works in layers:

  1. event_timestamp provides coarse chronological ordering
  2. batch_page_id separates events by page when timestamps match
  3. batch_ordering_id separates network requests within a page
  4. batch_event_index provides the final within-batch ordering

This compound ORDER BY should be used consistently across all window functions in your sessionized model. If you define a named window, include all four ordering fields:

WINDOW w_ordered AS (
PARTITION BY session_key
ORDER BY event_timestamp, batch_page_id, batch_ordering_id, batch_event_index
)

Then reference w_ordered wherever deterministic ordering matters.

Derived Positional Columns

With reliable event ordering established, you can derive useful positional columns for downstream analysis:

Event number in session gives each event a sequential position:

ROW_NUMBER() OVER w_ordered AS event__number_in_session

Session start and end flags identify entry and exit events:

-- After computing event__number_in_session and session__events
event__number_in_session = 1 AS event__is_session_start,
event__number_in_session = session__events AS event__is_session_end

Where session__events comes from:

COUNT(*) OVER (PARTITION BY session_key) AS session__events

These flags power entry and exit analysis without aggregation. Filter to event__is_session_start = TRUE for landing page analysis, or event__is_session_end = TRUE for exit pages. No GROUP BY, no subquery, just a WHERE clause on the enriched event table.

Time elapsed since session start enables pacing analysis:

(event_timestamp - MIN(event_timestamp) OVER (PARTITION BY session_key)) / 1000000
AS event__seconds_since_session_start

This answers “how long after landing do users typically purchase?” directly from the event-grain sessionized table, without additional joins or CTEs.

When Batch Fields Are Null

The batch fields are not always populated. Events from older SDK versions, Measurement Protocol hits, and some edge cases may have null values for one or more batch fields. Your ORDER BY handles this gracefully — nulls sort to a consistent position (first by default in BigQuery), so the ordering remains deterministic even if not perfectly accurate for those events.

If null batch fields are common in your data, consider adding NULLS LAST to your ORDER BY:

ORDER BY
event_timestamp,
batch_page_id NULLS LAST,
batch_ordering_id NULLS LAST,
batch_event_index NULLS LAST

This ensures events with batch metadata sort before those without, which is typically the correct behavior since events with batch fields came from the standard web or app SDK.

Practical Impact

The difference between timestamp-only ordering and batch-field ordering is most visible in:

  • Funnel analysis: Where step order determines funnel completion. A user who viewed a product and then added to cart looks different from one who added to cart before viewing (which suggests a direct link or saved item).
  • Attribution modeling: Where the first event in a session determines the landing page and traffic source. Getting the wrong “first” event means wrong attribution.
  • Path analysis: Where common event sequences are identified. Timestamp ties that resolve inconsistently produce noisy paths.

For dashboards showing aggregate metrics (session counts, total conversions), batch-field ordering rarely changes the numbers. Its impact is concentrated in sequence-dependent analysis.