Building a GA4 dbt project isn’t complicated, but it has enough GA4-specific nuance that general dbt knowledge alone won’t carry you through. The sharded export table breaks standard incremental patterns. Nested event parameters require macro-based extraction. Session context needs careful window function framing. This hub connects the notes that address each layer of the problem.
The central design decision: a wide event-grain intermediate table where every row carries session context. Instead of building session-grain tables directly (the approach most open-source packages take), this architecture enriches every event with landing page, traffic source, and conversion flags via window functions. The session mart becomes a trivial GROUP BY. Funnel analysis and event-sequence questions work without joins.
This hub breaks the pattern from GA4 + dbt: A Production-Ready Project Template into individual notes.
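As a rough sketch of the event-grain enrichment described above, the query below propagates session context onto every event row with window functions. The model name base__ga4__events and the columns session_key and page_location are assumptions for illustration; the linked notes may name them differently.

```sql
-- Sketch only: base__ga4__events, session_key, and page_location are hypothetical names.
select
    event_timestamp,
    event_name,
    session_key,
    page_location,

    -- Landing page: first non-null page_location in the session, repeated on every row.
    first_value(page_location ignore nulls) over (
        partition by session_key
        order by event_timestamp
        rows between unbounded preceding and unbounded following
    ) as landing_page,

    -- Conversion flag: true on every row if any event in the session is a purchase.
    max(event_name = 'purchase') over (
        partition by session_key
    ) as session_has_purchase

from {{ ref('base__ga4__events') }}
```

Because every row carries landing_page and session_has_purchase, a question such as "which events preceded a purchase in sessions that landed on the pricing page" becomes a filter on this one table rather than a join back to a session table.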
Before You Build: The Existing Package Landscape
GA4 dbt Package Ecosystem — The major open-source GA4 dbt packages (Velir/dbt-ga4, Fivetran, admindio), what each optimizes for, and the patterns worth borrowing even if you build custom. Start here to understand the design space.
Project Setup
GA4 dbt Project Configuration — The complete dbt_project.yml with variable-driven behavior, folder-level materializations, and test defaults. How to structure the project for reuse across multiple GA4 properties.
The Base Layer
GA4 Sharded-to-Partitioned Base Model — How GA4’s date-sharded export (events_YYYYMMDD) breaks standard incremental patterns, and the _TABLE_SUFFIX filter + static lookback approach that makes BigQuery partition pruning work correctly.
GA4 Parameter Extraction Macro — The extract_event_param dbt macro that encapsulates the correlated subquery pattern for event_params extraction. Why correlated subqueries rather than CROSS JOIN UNNEST, and the numeric variant for int/float/double fields.
GA4 Ecommerce Items UNNEST Pattern — Building a separate item-level grain model with intentional Cartesian UNNEST. When GA4’s nested items array needs its own model, and how item aggregates connect back to the event-level pipeline.
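To make the base-layer ideas above concrete, here is a minimal sketch of a base model that combines a shard-suffix filter with correlated-subquery parameter extraction. The source name, the three-day lookback, and the output column names are assumptions; the subqueries show the kind of expression a macro such as extract_event_param would wrap, not that macro's actual signature.

```sql
-- Sketch only: the source name, the 3-day lookback, and the output column
-- names are assumptions for illustration.
{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'field': 'event_date', 'data_type': 'date'}
) }}

select
    -- The GA4 export stores event_date as a 'YYYYMMDD' string.
    parse_date('%Y%m%d', event_date) as event_date,
    event_timestamp,
    event_name,
    user_pseudo_id,

    -- Correlated subquery: returns one scalar per event without multiplying rows.
    (select value.string_value
     from unnest(event_params)
     where key = 'page_location') as page_location,

    -- Numeric parameters live in a different field of the value struct.
    (select value.int_value
     from unnest(event_params)
     where key = 'ga_session_id') as ga_session_id

from {{ source('ga4', 'events') }}  -- assumed to point at the events_* wildcard table

{% if is_incremental() %}
-- Filter on the shard suffix so older daily shards are not scanned.
where _table_suffix >= format_date('%Y%m%d', date_sub(current_date(), interval 3 day))
{% endif %}
```

The scalar subquery keeps the model at one row per event, which is the main reason to prefer it over CROSS JOIN UNNEST when pulling several parameters from the same event.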
The Intermediate Layer
GA4 Events Sessionized Model — The CTE-by-CTE implementation of int__ga4__events_sessionized: URL cleaning, session context propagation via window functions, channel grouping, and item aggregation joins. The workhorse of the entire project.
GA4 Window Function Pitfalls — Three GA4-specific window function traps: the LAST_VALUE framing problem, IGNORE NULLS for sparse event data, and MAX for boolean session flag propagation.
GA4 Channel Grouping Macro — Google’s default channel grouping logic as a reusable dbt macro. The CASE ordering, regex patterns, and how to validate against your UTM conventions.
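The LAST_VALUE framing problem and the IGNORE NULLS point mentioned above can be illustrated with a short sketch. The model name int_events and the columns session_key and page_location are hypothetical.

```sql
-- Sketch only: int_events, session_key, and page_location are hypothetical names.
select
    session_key,
    event_timestamp,

    -- Trap: with ORDER BY and no explicit frame, the frame ends at the current row,
    -- so this returns the current row's own value (or a peer's), not the session's last.
    last_value(page_location) over (
        partition by session_key
        order by event_timestamp
    ) as exit_page_wrong,

    -- Fix: extend the frame to the whole partition and skip events without a page.
    last_value(page_location ignore nulls) over (
        partition by session_key
        order by event_timestamp
        rows between unbounded preceding and unbounded following
    ) as exit_page

from {{ ref('int_events') }}
```

The explicit frame matters because BigQuery's default frame for an ordered window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which silently turns LAST_VALUE into "the value so far".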
The Mart Layer
GA4 User Mart Pattern — Building a user-grain mart from session data with first/last touch attribution, lifetime value aggregation, and user_id stitching across user_pseudo_id values.
For the session mart, see Event-Grain Sessionization — once the sessionized model exists, the session mart is a GROUP BY with ANY_VALUE.
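A minimal sketch of that GROUP BY, assuming the sessionized model exposes session_key, landing_page, and session_has_purchase columns (these column names are illustrative, not confirmed by the linked note):

```sql
-- Sketch only: column names are assumptions; the model name comes from the hub text above.
select
    session_key,

    -- Session-level columns are identical on every event row, so any row will do.
    any_value(landing_page) as landing_page,
    any_value(session_has_purchase) as session_has_purchase,

    -- True aggregates over the session's events.
    min(event_timestamp) as session_start_ts,
    max(event_timestamp) as session_end_ts,
    count(*) as event_count,
    countif(event_name = 'page_view') as page_views

from {{ ref('int__ga4__events_sessionized') }}
group by session_key
```

ANY_VALUE is only safe here because the upstream window functions already made those columns constant within each session; on a column that varies by event it would return an arbitrary row's value.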
Testing
GA4-Specific dbt Testing Patterns — Tests that standard dbt schema tests miss: missing session_start events, orphaned transactions, suspicious session metrics. Source freshness configuration and singular test patterns.
Related Concepts
- GA4 Sessionization Hub — All concepts around building sessions from GA4 event data
- Event-Grain Sessionization — Why enriched events beat session-grain tables
- GA4 Session Key Construction — The composite key requirement
- Late-Arriving Data and the Lookback Window Pattern — The general lookback window pattern
- GA4 BigQuery Number Discrepancies — Why numbers will differ from the GA4 interface