
GA4 dbt Project Template

Hub connecting all concepts in building a production-ready dbt project for GA4 BigQuery exports — from base model to marts, with testing and documentation.

Tags: ga4, dbt, bigquery, data modeling, analytics, incremental processing

Building a GA4 dbt project isn’t complicated, but it has enough GA4-specific nuance that starting from general dbt knowledge isn’t quite enough. The sharded export table breaks standard incremental patterns. Nested event parameters require macro-based extraction. Session context needs careful window function framing. This hub connects the notes that address each layer of the problem.

The central design decision: a wide event-grain intermediate table where every row carries session context. Instead of building session-grain tables directly (the approach most open-source packages take), this architecture enriches every event with landing page, traffic source, and conversion flags via window functions. The session mart becomes a trivial GROUP BY. Funnel analysis and event-sequence questions work without joins.

The notes linked here break down the pattern from GA4 + dbt: A Production-Ready Project Template.

Before You Build: The Existing Package Landscape

GA4 dbt Package Ecosystem — The major open-source GA4 dbt packages (Velir/dbt-ga4, Fivetran, admindio), what each optimizes for, and the patterns worth borrowing even if you build custom. Start here to understand the design space.

Project Setup

GA4 dbt Project Configuration — The complete dbt_project.yml with variable-driven behavior, folder-level materializations, and test defaults. How to structure the project for reuse across multiple GA4 properties.

The Base Layer

GA4 Sharded-to-Partitioned Base Model — How GA4’s date-sharded export (events_YYYYMMDD) breaks standard incremental patterns, and the _TABLE_SUFFIX filter + static lookback approach that makes BigQuery partition pruning work correctly.
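
As a rough illustration of the idea, a base model along these lines reads the sharded export through a wildcard source and writes a date-partitioned table. This is a minimal sketch, not the linked note's exact model: the file name, the `ga4` source (assumed to point at `events_*`), and the `ga4_lookback_days` variable are all hypothetical. The lookback is a constant expression rather than a subquery against the target table, since BigQuery only limits the shards it scans when the `_TABLE_SUFFIX` filter is a constant expression.

```sql
-- models/base/base__ga4__events.sql (hypothetical file name)
{{
    config(
        materialized='incremental',
        incremental_strategy='insert_overwrite',
        partition_by={'field': 'event_date', 'data_type': 'date'}
    )
}}

select
    -- the export stores event_date as a 'YYYYMMDD' string
    parse_date('%Y%m%d', event_date) as event_date,
    event_timestamp,
    event_name,
    user_pseudo_id,
    event_params,
    items
from {{ source('ga4', 'events') }}  -- assumed to resolve to `events_*`
where
    -- skip intraday shards, whose suffix looks like 'intraday_20240101'
    _table_suffix not like 'intraday_%'
    {% if is_incremental() %}
    -- static lookback: a constant expression, so only recent shards are scanned
    and _table_suffix >= format_date(
        '%Y%m%d',
        date_sub(current_date(), interval {{ var('ga4_lookback_days', 3) }} day)
    )
    {% endif %}
```

With `insert_overwrite`, each run replaces the partitions covered by the lookback window, which is one way to absorb late-arriving updates to recent shards.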

GA4 Parameter Extraction Macro — The extract_event_param dbt macro that encapsulates the correlated subquery pattern for event_params extraction. Why correlated subqueries rather than CROSS JOIN UNNEST, and the numeric variant for int/float/double fields.
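
A minimal sketch of what such a macro might look like follows. The macro name `extract_event_param` comes from the note above; the argument names, the default value type, and the name of the numeric variant are assumptions made for illustration.

```sql
-- macros/extract_event_param.sql (hypothetical file name and signature)
{% macro extract_event_param(param_name, value_type='string_value') %}
    (
        -- correlated scalar subquery: keeps one output row per event
        select ep.value.{{ value_type }}
        from unnest(event_params) as ep
        where ep.key = '{{ param_name }}'
        limit 1
    )
{% endmacro %}

-- Assumed numeric variant: a numeric parameter can arrive in any of the
-- int/float/double fields, so take whichever one is populated.
{% macro extract_event_param_numeric(param_name) %}
    (
        select coalesce(
            cast(ep.value.int_value as float64),
            ep.value.float_value,
            ep.value.double_value
        )
        from unnest(event_params) as ep
        where ep.key = '{{ param_name }}'
        limit 1
    )
{% endmacro %}
```

In a model this would be called as `{{ extract_event_param('page_location') }} as page_location` or `{{ extract_event_param('ga_session_id', 'int_value') }} as ga_session_id`. Because each call is a scalar subquery, adding another parameter adds a column and never multiplies rows.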

GA4 Ecommerce Items UNNEST Pattern — Building a separate item-level grain model with intentional Cartesian UNNEST. When GA4’s nested items array needs its own model, and how item aggregates connect back to the event-level pipeline.
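
The item-grain idea can be sketched roughly as below, assuming a hypothetical base model named `base__ga4__events` that keeps the raw `items` array. The comma join against `unnest(items)` deliberately produces one row per event and item, and drops events whose array is empty.

```sql
-- models/base/base__ga4__event_items.sql (hypothetical file name)
select
    e.event_date,
    e.event_timestamp,
    e.event_name,
    e.user_pseudo_id,
    item.item_id,
    item.item_name,
    item.quantity,
    item.price
from {{ ref('base__ga4__events') }} as e,  -- hypothetical upstream model
    unnest(e.items) as item                -- one output row per (event, item)
where e.event_name in ('view_item', 'add_to_cart', 'purchase')
```

Keeping this fan-out in its own model means the event-grain pipeline never sees duplicated events; only pre-aggregated item totals need to be joined back.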

The Intermediate Layer

GA4 Events Sessionized Model — The CTE-by-CTE implementation of int__ga4__events_sessionized: URL cleaning, session context propagation via window functions, channel grouping, and item aggregation joins. The workhorse of the entire project.

GA4 Window Function Pitfalls — Three GA4-specific window function traps: the LAST_VALUE framing problem, IGNORE NULLS for sparse event data, and MAX for boolean session flag propagation.
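
The three traps can be shown in one query. This is a sketch under assumptions: `session_key` and an already-extracted `page_location` column are hypothetical names on a hypothetical upstream model.

```sql
select
    session_key,
    event_timestamp,
    event_name,

    -- Sparse data: many events carry no page_location, so skip nulls.
    first_value(page_location ignore nulls) over (
        partition by session_key
        order by event_timestamp
        rows between unbounded preceding and unbounded following
    ) as landing_page,

    -- Framing: with ORDER BY and no explicit frame, the window ends at the
    -- current row, so LAST_VALUE would just return the current row's value.
    last_value(page_location ignore nulls) over (
        partition by session_key
        order by event_timestamp
        rows between unbounded preceding and unbounded following
    ) as exit_page,

    -- Flag propagation: MAX over a boolean is true on every row of the
    -- session if any single event satisfied the condition.
    max(event_name = 'purchase') over (
        partition by session_key
    ) as session_has_purchase

from {{ ref('base__ga4__events') }}  -- hypothetical upstream model
```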

GA4 Channel Grouping Macro — Google’s default channel grouping logic as a reusable dbt macro. The CASE ordering, regex patterns, and how to validate against your UTM conventions.

The Mart Layer

GA4 User Mart Pattern — Building a user-grain mart from session data with first/last touch attribution, lifetime value aggregation, and user_id stitching across user_pseudo_id values.

For the session mart, see Event-Grain Sessionization — once the sessionized model exists, the session mart is a GROUP BY with ANY_VALUE.
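
Under that design, a session mart might look roughly like the following. The model name `int__ga4__events_sessionized` is from the notes above; the mart file name and the column names are assumed for illustration.

```sql
-- models/marts/mart__ga4__sessions.sql (hypothetical file name)
select
    session_key,
    -- session-level attributes are identical on every row of a session,
    -- so ANY_VALUE is enough to pick them up
    any_value(user_pseudo_id) as user_pseudo_id,
    any_value(landing_page) as landing_page,
    any_value(session_has_purchase) as session_has_purchase,
    min(event_timestamp) as session_start_timestamp,
    count(*) as event_count,
    countif(event_name = 'page_view') as page_views
from {{ ref('int__ga4__events_sessionized') }}
group by session_key
```

`ANY_VALUE` is only safe here because the window functions upstream already wrote the same value to every event in the session; if a column can differ across rows, it needs a real aggregate instead.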

Testing

GA4-Specific dbt Testing Patterns — Tests that standard dbt schema tests miss: missing session_start events, orphaned transactions, suspicious session metrics. Source freshness configuration and singular test patterns.
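
As one example of a singular test in this spirit, the query below returns sessions that contain no `session_start` event; dbt treats any returned row as a failure. The file name and the `session_key` column are hypothetical, and the `warn` severity is a suggestion, since some sessions can legitimately lack the event.

```sql
-- tests/assert_sessions_have_session_start.sql (hypothetical file name)
{{ config(severity='warn') }}

select
    session_key,
    count(*) as event_count
from {{ ref('int__ga4__events_sessionized') }}
group by session_key
-- keep only sessions where no event is a session_start
having countif(event_name = 'session_start') = 0
```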