GA4 ecommerce data lives in the items field — a repeated RECORD containing one element per product in a given event. A single purchase event for a three-item cart produces three elements in the items array. Building accurate ecommerce analysis requires expanding this array into a separate item-level grain model.
Why a Separate Model
The rest of your GA4 pipeline works at event grain: one row per event. The sessionized events table maintains this grain carefully — correlated subqueries for event_params specifically because they don’t expand rows.
items is different. You genuinely need one row per item per event. A purchase with three products needs three rows to track each product’s revenue, quantity, and category correctly. Keeping items embedded in the event-grain table isn’t feasible — you’d need to aggregate them into arrays or denormalized columns, losing the ability to query by product.
The solution is a dedicated intermediate model at item-level grain, materialized as a view (cheap because it references an already-materialized incremental base model).
The Model
-- models/intermediate/ga4/int__ga4__event_items.sql
{{ config( materialized='view' )}}
WITH events_with_items AS (
SELECT event__key, event__date, event__timestamp_utc, user__pseudo_id, session__key, event__name, transaction__id, items__array FROM {{ ref('base__ga4__events') }} WHERE event__name IN ('purchase', 'add_to_cart', 'remove_from_cart', 'view_item', 'begin_checkout', 'add_payment_info') AND ARRAY_LENGTH(items__array) > 0
)
SELECT e.event__key, e.event__date, e.event__timestamp_utc, e.user__pseudo_id, e.session__key, e.event__name, e.transaction__id,
-- Item details item.item_id AS item__id, item.item_name AS item__name, item.item_brand AS item__brand, item.item_category AS item__category, item.item_category2 AS item__category2, item.item_category3 AS item__category3, item.item_variant AS item__variant, item.price AS item__price, item.quantity AS item__quantity, item.coupon AS item__coupon, item.item_list_name AS item__list_name, item.item_list_index AS item__list_index,
-- Calculate item revenue COALESCE(item.price, 0) * COALESCE(item.quantity, 1) AS item__revenue
FROM events_with_items e,UNNEST(e.items__array) AS itemThis model intentionally uses the Cartesian UNNEST pattern (FROM events_with_items e, UNNEST(e.items__array) AS item) rather than a correlated subquery. The row expansion is exactly what you want: one row per item per event.
The Event Filter
Not all GA4 events carry item data. The filter event__name IN ('purchase', 'add_to_cart', ...) does two things:
- Reduces the input size significantly — only ecommerce events have items
- Makes the model’s purpose explicit
The ARRAY_LENGTH(items__array) > 0 guard prevents UNNEST from producing rows with a null item when the array is empty. Some GA4 implementations send ecommerce events without item arrays — this prevents those from appearing in the model as rows with all-null item fields.
Connecting Items Back to Sessions
The item model preserves event__key and session__key from the base model. This makes joins natural:
-- Item revenue by session channelSELECT s.session__channel_grouping, SUM(i.item__revenue) AS total_revenue, COUNT(DISTINCT i.transaction__id) AS transactionsFROM mrt__analytics__sessions sJOIN int__ga4__event_items i ON s.session__key = i.session__keyWHERE i.event__name = 'purchase'GROUP BY 1ORDER BY 2 DESCThe sessionized events model aggregates item data back to event level:
-- In int__ga4__events_sessionizedevent_items AS ( SELECT event__key, COUNT(*) AS event__items, SUM(item__revenue) AS event__items_revenue FROM {{ ref('int__ga4__event_items') }} GROUP BY event__key)This pre-aggregation avoids repeated joins downstream. The sessionized table carries event__items and event__items_revenue as simple integer and float columns on every purchase event — no join required for standard ecommerce metrics.
Item Categories in GA4
GA4 supports a five-level product category hierarchy (item_category through item_category5). In practice, most implementations only use the first two or three levels. The model extracts all three meaningful levels; adjust based on what your tracking implementation actually populates.
Since October 2023, GA4 also supports item_params — a nested repeated field inside each item for custom product dimensions. This adds another layer of UNNEST complexity if your implementation uses custom item parameters. Handle it with a separate extraction CTE or an additional model if needed.
Materialization Choice: View
The item model is a view, not an incremental table. This is intentional:
- Storage cost avoided: The base model is already materialized as an incremental table. A view on top adds no storage cost.
- Freshness automatic: The view always reflects the latest state of the base model. No incremental logic needed.
- Query cost acceptable: Ecommerce item queries are typically run on bounded date ranges. BigQuery’s partition pruning on the base model limits scan costs regardless of the view layer.
If item-level queries become expensive at scale, materialize the view as an incremental table with the same lookback pattern as the base model. But start with a view — the simplicity is worth it.