Channel grouping translates raw UTM source/medium combinations into business-readable categories: Organic Search, Paid Social, Direct, Email. GA4’s interface applies these groupings automatically using Google’s default rules. In a dbt project, you need to replicate this logic — and a macro is the right abstraction.
Why a Macro
Channel grouping is needed in at least two places: the sessionized events intermediate model and potentially any mart that includes channel attribution. Repeating the CASE statement would mean multiple maintenance points. A macro calls the logic once and generates the SQL wherever referenced.
The macro takes column references (not values) as arguments, which is the standard dbt pattern for SQL-generating macros:
-- macros/ga4/default_channel_grouping.sql
{% macro default_channel_grouping(source_column, medium_column) %} CASE -- Direct WHEN {{ source_column }} IS NULL OR {{ source_column }} = '(direct)' OR {{ source_column }} = '(none)' THEN 'Direct'
-- Paid Search WHEN REGEXP_CONTAINS({{ medium_column }}, r'^(cpc|ppc|paidsearch)$') OR ({{ source_column }} = 'google' AND {{ medium_column }} = 'sem') THEN 'Paid Search'
-- Paid Social WHEN REGEXP_CONTAINS({{ medium_column }}, r'^(paidsocial|paid-social|paid_social)$') OR (REGEXP_CONTAINS({{ source_column }}, r'facebook|instagram|linkedin|twitter|tiktok') AND REGEXP_CONTAINS({{ medium_column }}, r'^(cpc|ppc|paid)')) THEN 'Paid Social'
-- Display WHEN REGEXP_CONTAINS({{ medium_column }}, r'^(display|cpm|banner)$') THEN 'Display'
-- Organic Search WHEN {{ medium_column }} = 'organic' OR REGEXP_CONTAINS({{ source_column }}, r'google|bing|yahoo|duckduckgo|baidu') AND {{ medium_column }} = 'organic' THEN 'Organic Search'
-- Organic Social WHEN REGEXP_CONTAINS({{ source_column }}, r'facebook|instagram|linkedin|twitter|tiktok|pinterest|youtube') AND ({{ medium_column }} IS NULL OR {{ medium_column }} IN ('social', 'referral', '(none)')) THEN 'Organic Social'
-- Email WHEN REGEXP_CONTAINS({{ medium_column }}, r'^(email|e-mail|e_mail)$') THEN 'Email'
-- Affiliate WHEN REGEXP_CONTAINS({{ medium_column }}, r'^(affiliate|affiliates)$') THEN 'Affiliates'
-- Referral WHEN {{ medium_column }} = 'referral' THEN 'Referral'
-- Other ELSE 'Other' END{% endmacro %}Usage in the sessionized model:
{{ default_channel_grouping('session__source_final', 'session__medium_final') }} AS session__channel_groupingThe Case Order Matters
The CASE statement is evaluated top to bottom, first match wins. The ordering here encodes priority decisions:
- Direct first — Null source is always Direct, before any medium-based logic
- Paid before organic —
cpcmedium gets classified as Paid Search before the Organic Search rule can catch a Google source - Paid Social medium first, then source + medium — Teams using
paidsocialas medium get classified directly; teams usingcpcmedium with social network sources get caught by the second condition - Organic Social checks source + non-paid medium — Catches facebook/instagram with social/referral medium, but not if they have a paid medium (already caught above)
- Referral last — Falls through after all other channel classifications
Changing this order changes the output. If you put Referral before Organic Social, some social referral traffic will be classified as Referral instead.
Where Channel Grouping Diverges from GA4’s Interface
Your macro will not match GA4’s interface exactly. The interface uses Google’s full channel grouping ruleset which is more complex and includes platform-specific logic Google doesn’t publish completely. Expect some differences in the ‘Other’ bucket.
The most common divergence points:
Branded paid search vs non-branded. Google Ads can tag campaigns with branded or non-branded signals, and the interface may distinguish these as separate channels. Your macro classifies both as Paid Search.
Cross-network and Performance Max. Newer campaign types may use medium values not covered by the standard regex patterns. When you see traffic landing in ‘Other’ that you know is paid, check the raw source/medium values and add matching conditions.
YouTube. YouTube can appear as both Organic Social (organic YouTube views) and Paid Video (paid YouTube ads with video medium). The regex catches YouTube in Organic Social; make sure your paid YouTube campaigns use a medium like cpc or paid to be caught by the Paid Social or Display rules first.
Aligning with Your UTM Conventions
The macro encodes assumptions about your UTM taxonomy. If your team uses paid-social as a medium convention for social ads, the regex r'^(paidsocial|paid-social|paid_social)$' covers it. If you use something non-standard like social_paid, it’ll fall through to ‘Other’.
Before deploying, run a validation query:
SELECT session__source, session__medium, session__channel_grouping, COUNT(*) AS sessionsFROM mrt__analytics__sessionsWHERE session__channel_grouping = 'Other'GROUP BY 1, 2, 3ORDER BY 4 DESCLIMIT 50Any significant session volumes in ‘Other’ with recognizable sources deserve investigation. Either your UTM naming doesn’t match the expected patterns (a naming convention problem) or the macro needs an additional condition.
Testing the Output
Add an accepted_values test to the column in your YAML:
- name: session__channel_grouping tests: - accepted_values: values: - 'Direct' - 'Organic Search' - 'Paid Search' - 'Paid Social' - 'Organic Social' - 'Email' - 'Referral' - 'Display' - 'Affiliates' - 'Other'This test will warn (or error, depending on severity) if the macro generates an unexpected channel name. Useful as a guard against future modifications that introduce a typo or new channel category without updating the accepted values list.