Campaign Naming and UTM Standardization

Campaign names are a primary source of human error in cross-platform reporting. Without consistent naming, there is no reliable way to group campaigns across platforms for comparison. Solving it requires buy-in from the media buying team, not just the data team.

The Three Approaches (Use All Three)

In practice, most teams end up using all three of these approaches simultaneously: conventions for new campaigns, regex for the bulk of existing ones, and seed overrides for the exceptions.

1. Enforce Naming Conventions at Creation Time

The most reliable approach. Define a convention that embeds metadata in campaign names:

{platform}_{objective}_{audience}_{geo}_{date}

Examples:

google_conversions_retargeting_us_202603
meta_awareness_lookalike_uk_202603
linkedin_leads_decision-makers_global_202603

The convention should be:

Documented in a place the media buying team actually checks (not buried in a wiki)
Enforced through platform-level naming templates where possible
Validated by the data team through automated checks on new campaign names

This requires buy-in from the media buying team. If they don’t follow the convention, the data team can’t parse what they can’t parse. Make the case that consistent naming is what enables the cross-platform insights they’re asking for.

2. Parse Campaign Names with Regex in dbt

When conventions are mostly followed but not perfectly — which is the reality for most teams — regex parsing extracts structure from semi-structured names.

SELECT
  campaign__name,
  REGEXP_EXTRACT(campaign__name, r'^([a-z]+)_') AS campaign__parsed_platform,
  REGEXP_EXTRACT(campaign__name, r'_([a-z]+)_[a-z]+_') AS campaign__parsed_objective,
  REGEXP_EXTRACT(campaign__name, r'_([a-z]+)_\d{6}$') AS campaign__parsed_geo,
  REGEXP_EXTRACT(campaign__name, r'_(\d{6})$') AS campaign__parsed_date
FROM {{ ref('int__google_ads__campaign_report') }}

This works well for campaigns that follow the convention and produces NULLs for campaigns that don’t. The NULLs are a signal — they identify campaigns that need manual mapping.

A few regex parsing tips:

Always lowercase first before parsing, since campaign names might have inconsistent casing
Test against a sample of real campaign names before deploying — edge cases are guaranteed
Log the parse failure rate so you know how many campaigns are falling through to seed overrides

3. Seed File Mapping Table

For campaigns that don’t follow any convention — legacy campaigns, partner-created campaigns, campaigns from agencies that use their own naming — a dbt seed file provides explicit overrides.

platform,original_campaign_name,standardized_objective,standardized_audience,standardized_geo
google_ads,Brand - Summer 2025 [Exact],conversions,brand,us
facebook_ads,FB_Retargeting_old_format,conversions,retargeting,us
linkedin_ads,LI ABM Campaign - Enterprise,leads,enterprise,global

Reference the seed in your intermediate model:

WITH campaign_base AS (
  SELECT
    campaign__id,
    campaign__name,
    REGEXP_EXTRACT(campaign__name, r'^([a-z]+)_') AS campaign__parsed_objective
  FROM {{ ref('int__google_ads__campaign_report') }}
),

overrides AS (
  SELECT * FROM {{ ref('campaign_name_overrides') }}
  WHERE platform = 'google_ads'
)

SELECT
  cb.campaign__id,
  cb.campaign__name,
  COALESCE(o.standardized_objective, cb.campaign__parsed_objective) AS campaign__objective,
  COALESCE(o.standardized_audience, NULL) AS campaign__audience,
  COALESCE(o.standardized_geo, NULL) AS campaign__geo
FROM campaign_base cb
LEFT JOIN overrides o
  ON cb.campaign__name = o.original_campaign_name

The COALESCE pattern means: use the override if one exists, otherwise fall back to the regex-parsed value. This layered approach handles the full spectrum from perfectly-named campaigns to total naming chaos.

UTM Parameter Hygiene

UTM parameters are the cross-platform bridge between ad spend and web analytics conversions. The utm_source, utm_medium, and utm_campaign parameters let you join ad platform spend data with GA4 session data, creating an attribution model independent of each platform’s self-reported conversions.

Critical UTM Rules

Always lowercase. UTMs are case-sensitive in GA4. utm_source=Facebook and utm_source=facebook create two separate sources. Enforce lowercase at the ad platform level and add a LOWER() in your dbt models as a safety net.

Use hyphens, not spaces. Spaces in URLs get encoded as %20, which looks ugly in reports and creates matching problems. Use hyphens or underscores consistently.

Maintain consistent platform naming. Decide once whether you use “facebook” or “fb” or “meta” as the utm_source value, and stick with it everywhere. Document the canonical values and enforce them.

Use dynamic parameters where available. Google supports {campaignid}, {adgroupid}, and {creative} as dynamic UTM values that auto-populate with the actual IDs. This eliminates human error for the most granular UTM parameters:

utm_source=google&utm_medium=cpc&utm_campaign={campaignid}&utm_content={adgroupid}&utm_term={keyword}

Meta supports similar dynamic parameters: {{campaign.name}}, {{adset.name}}, {{ad.name}}. LinkedIn supports {campaign_id} and {creative_id}. Use these wherever possible — they’re more reliable than manually typed values.

UTMs as the Attribution Key

Fivetran’s ad_reporting__url_report model includes UTM parameter tracking natively. The dbt Labs Attribution Playbook uses web sessions joined on UTM parameters as the key between ad spend and conversions for positional attribution models.

The join pattern:

-- Join ad spend with GA4 sessions via UTM parameters
SELECT
  ga4.date_day,
  ga4.utm_source,
  ga4.utm_medium,
  ga4.utm_campaign,
  SUM(ga4.sessions) AS sessions,
  SUM(ga4.conversions) AS web_conversions,
  SUM(ads.spend) AS ad_spend
FROM {{ ref('int__ga4__sessions') }} ga4
LEFT JOIN {{ ref('mrt__marketing__campaign_report') }} ads
  ON ga4.date_day = ads.date_day
  AND LOWER(ga4.utm_source) = ads.utm_source
  AND LOWER(ga4.utm_campaign) = ads.utm_campaign
GROUP BY 1, 2, 3, 4

This gives you an attribution model that’s independent of what each platform claims it drove. The UTM parameters tie the ad click to the web session to the conversion, using a consistent methodology across all channels.

Why This Matters for Cross-Platform Reporting

Without standardized campaign names, a unified model can report how much each platform spent but not how much was spent on retargeting across all platforms. Without clean UTM parameters, ad spend data cannot be joined with web analytics to calculate ROAS.

Campaign naming and UTM hygiene are the prerequisite for cross-platform comparison. The technical modeling work — the UNION, the metric normalization, the intermediate layer — assumes that campaigns can be meaningfully grouped and that UTM parameters can be reliably joined.