ServicesAboutNotesContact Get in touch →
EN FR
Note

GA4 dbt Package Ecosystem

An overview of the major open-source dbt packages for GA4 BigQuery exports — what they optimize for, what they miss, and when to build custom.

Planted
ga4dbtbigquerydata modelinganalytics

The open-source community has produced several GA4 dbt packages with meaningful adoption. This note describes each package’s approach and trade-offs, including where a custom build is more appropriate.

Velir/dbt-ga4: The Community Standard

The Velir/dbt-ga4 package is the dominant option with 380+ GitHub stars and active development. Its creator positioned it as a home for communal knowledge around GA4 and BigQuery — an organic accumulation of patterns that actually work in production.

The architecture follows a clean base-to-marts flow:

models/
├── base/
│ ├── base__ga4__events
│ ├── stg_ga4__events
│ ├── stg_ga4__event_page_view
│ └── stg_ga4__sessions_traffic_sources
└── marts/
└── core/
├── dim_ga4__sessions
└── fct_ga4__sessions

Three patterns from Velir are worth borrowing regardless of whether you use the package itself:

Variable-driven parameter extraction. Rather than hardcoding which event parameters to extract, Velir uses dbt_project.yml variables:

vars:
ga4:
page_view_custom_parameters:
- name: "clean_event"
value_type: "string_value"

This makes the package configurable without code changes — a good pattern for any GA4 project where different properties track different custom parameters.

Static incremental lookback. Instead of dynamically querying the destination table for the latest date (which forces BigQuery to scan all sharded tables), Velir reprocesses the last N days on every run. See GA4 Sharded-to-Partitioned Base Model for why dynamic lookbacks are dangerous with date-sharded tables.

Composite session keys. Sessions are identified by user_pseudo_id + ga_session_id, not ga_session_id alone. See GA4 Session Key Construction for the collision mechanics.

The main limitation: Velir’s session-grain marts are not partitioned. At high traffic volumes, this makes downstream queries expensive. The documentation explicitly recommends disabling certain models for high-traffic sites — which is a sign you may need a custom approach.

Fivetran/dbt_ga4_export

The Fivetran package targets teams who ingest GA4 data through Fivetran rather than the native BigQuery export. It supports multiple warehouses (Snowflake, Redshift, Databricks, BigQuery) and handles the schema normalization that Fivetran applies during ingestion.

With only 7 models total, it’s intentionally minimal. Fivetran flattens event_params during ingestion, so the unnesting complexity that dominates raw BigQuery export work is already handled upstream. If you’re on Fivetran, this is likely your starting point.

The downside: you’re locked into Fivetran’s flattening decisions. Custom event parameters that Fivetran doesn’t know about may not surface cleanly.

MO Data Consulting’s Package

This package takes a distinctive approach: it dynamically flattens all event_params into individual columns at compile time. Jinja iterates over your event parameter list and generates a SELECT with one column per parameter. No subquery extraction, no UNNEST at query time.

Convenient for exploration when you want a clean tabular view of all parameters. But it can create unwieldy wide tables — and if your parameter list changes, you need to recompile rather than just adjusting your extraction logic.

admindio/simple_ga4_dbt

The most opinionated of the bunch. Built for cost-efficiency and simplicity, with built-in attribution models and minimal package dependencies. Works well for teams who want something running quickly without extensive customization.

The “simple” in the name is accurate — it trades configurability for ease of setup.

When to Build Custom

All four packages optimize toward session-grain marts: one row per session with aggregated metrics. That’s the most common GA4 reporting pattern, and it makes sense as a default.

The limitation reveals itself when you need event-level analysis with session context — funnel analysis, event-sequence questions, time-to-conversion calculations. Session-grain tables require joining back to raw events for these queries, adding complexity and consistency risk.

The alternative architecture — a wide event-grain intermediate table where every row carries session context via window functions — isn’t well-served by any existing package. None of them build this kind of enriched event table as their primary output.

You also might want custom if:

  • Your GA4 property tracks many custom events with non-standard parameter names
  • You need specific channel grouping logic that differs from Google’s defaults
  • Your ecommerce item tracking requires item-level grain models
  • You want tighter control over which events get materialized and at what cost

The packages are useful references for understanding patterns — especially Velir’s incremental strategy and session key construction — even when the package itself is not used directly.