Linear attribution treats every touchpoint equally. That’s a reasonable starting point, but it ignores something obvious: the first touchpoint that discovered a customer and the last touchpoint that closed the deal probably matter more than the random email they opened in between.
Position-based and time-decay models address this by assigning different weights to touchpoints based on where they fall in the journey or how recently they occurred. These models require slightly more complex SQL but produce attribution that better reflects how marketing actually works.
Position-based models: weighting by journey stage
Position-based models assign credit based on a touchpoint’s location in the customer journey. The most common is the U-shaped model, which emphasizes first and last touches while still crediting the middle.
The U-shaped model (40-20-40)
The U-shaped model distributes credit as follows:
- First touchpoint: 40%
- Last touchpoint: 40%
- All middle touchpoints: 20% split equally
The formula:
Credit(first) = 0.40Credit(last) = 0.40Credit(middle_i) = 0.20 / (n - 2)This model works well for most B2C and B2B scenarios where you want to value both the channel that drove initial awareness and the channel that triggered conversion.
The W-shaped model (30-30-30-10)
B2B companies with longer sales cycles often use W-shaped attribution, which adds weight to a key middle touchpoint like lead creation or demo request:
Credit(first) = 0.30Credit(key_middle) = 0.30Credit(last) = 0.30Credit(other) = 0.10 / (n - 3)W-shaped requires identifying that key middle moment in your data, which adds implementation complexity.
Edge cases
Single-touchpoint conversions get 100% credit to that touchpoint. Two-touchpoint conversions split credit 50/50. Your SQL needs to handle these explicitly, or you’ll get incorrect results.
U-shaped attribution in BigQuery
A complete implementation of the 40-20-40 U-shaped model:
WITH touchpoints_positioned AS ( SELECT user_id, transaction_id, channel, revenue, touchpoint_timestamp, ROW_NUMBER() OVER ( PARTITION BY user_id, transaction_id ORDER BY touchpoint_timestamp ASC ) AS position, COUNT(*) OVER ( PARTITION BY user_id, transaction_id ) AS total_touches FROM touchpoints WHERE touchpoint_timestamp >= TIMESTAMP_SUB( conversion_timestamp, INTERVAL 30 DAY ))SELECT user_id, transaction_id, channel, CASE WHEN total_touches = 1 THEN 1.0 WHEN total_touches = 2 THEN 0.5 WHEN position = 1 THEN 0.4 WHEN position = total_touches THEN 0.4 ELSE 0.2 / (total_touches - 2) END * revenue AS attributed_revenueFROM touchpoints_positionedWhat each piece does:
ROW_NUMBER()identifies each touchpoint’s position in the journeyCOUNT(*) OVER()gives us the total touchpoints per conversion- The
CASEstatement handles edge cases first, then applies the 40-20-40 split - Multiplying the weight by revenue produces the attributed amount
For a customer with 5 touchpoints and a $100 conversion:
- Touch 1: $40 (40%)
- Touch 2: $6.67 (20% ÷ 3)
- Touch 3: $6.67 (20% ÷ 3)
- Touch 4: $6.67 (20% ÷ 3)
- Touch 5: $40 (40%)
Industry weight variations
The 40-20-40 split isn’t universal. Adjust based on your business:
| Industry | Recommended Weights |
|---|---|
| E-commerce/retail | Standard 40-20-40 |
| B2B SaaS | W-shaped (30-30-30-10) with demo request |
| High-consideration products | 45-10-45 to emphasize endpoints |
Time-decay attribution: weighting by recency
Time-decay attribution assigns more credit to touchpoints that occurred closer to the conversion. A touchpoint from yesterday matters more than one from two weeks ago.
The exponential decay formula
Time-decay uses exponential decay with a half-life parameter:
Weight(touchpoint) = 2^(-days_before_conversion / half_life)The half-life determines how quickly credit diminishes. With a 7-day half-life:
- Day 0 (conversion day): Weight = 1.0 (100%)
- Day 7: Weight = 0.5 (50%)
- Day 14: Weight = 0.25 (25%)
- Day 21: Weight = 0.125 (12.5%)
Google Analytics uses a 7-day half-life by default. This works reasonably well for e-commerce but may be too aggressive for B2B.
Choosing the right half-life
Half-life should roughly match your typical sales cycle length:
| Industry | Half-Life | Lookback Window |
|---|---|---|
| B2C E-commerce (impulse) | 3-7 days | 7-14 days |
| B2C E-commerce (considered) | 7-14 days | 30-45 days |
| B2B Mid-Market | 14-30 days | 90-180 days |
| B2B Enterprise | 30-45 days | 180+ days |
A half-life that’s too short under-credits early-funnel channels. One that’s too long approaches linear attribution, defeating the purpose.
Time-decay attribution in BigQuery
The implementation calculates raw weights, then normalizes so they sum to 1:
WITH decay_weights AS ( SELECT user_id, transaction_id, channel, revenue, conversion_timestamp, touchpoint_timestamp, POW( 0.5, TIMESTAMP_DIFF(conversion_timestamp, touchpoint_timestamp, MINUTE) / (7.0 * 24 * 60) ) AS raw_weight FROM touchpoints WHERE touchpoint_timestamp >= TIMESTAMP_SUB( conversion_timestamp, INTERVAL 30 DAY )),normalized AS ( SELECT *, SUM(raw_weight) OVER ( PARTITION BY user_id, transaction_id ) AS total_weight FROM decay_weights)SELECT user_id, transaction_id, channel, (raw_weight / total_weight) * revenue AS attributed_revenueFROM normalizedWhat each piece does:
TIMESTAMP_DIFF(..., MINUTE)calculates the time gap in minutes for precision7.0 * 24 * 60converts the 7-day half-life to minutes (10,080 minutes)POW(0.5, time_ratio)applies exponential decay- Dividing by
total_weightnormalizes so all weights sum to 1
For a customer with touchpoints at day -14, day -7, and day 0:
- Day -14: raw_weight = 0.25, normalized = 0.25/1.75 = 14.3%
- Day -7: raw_weight = 0.5, normalized = 0.5/1.75 = 28.6%
- Day 0: raw_weight = 1.0, normalized = 1.0/1.75 = 57.1%
Parameterizing the half-life
Hard-coding the half-life makes experimentation difficult. Use a variable or configuration table:
DECLARE half_life_days FLOAT64 DEFAULT 7.0;
WITH decay_weights AS ( SELECT user_id, transaction_id, channel, revenue, conversion_timestamp, touchpoint_timestamp, POW( 0.5, TIMESTAMP_DIFF(conversion_timestamp, touchpoint_timestamp, HOUR) / (half_life_days * 24) ) AS raw_weight FROM touchpoints WHERE touchpoint_timestamp >= TIMESTAMP_SUB( conversion_timestamp, INTERVAL 30 DAY ))-- ... rest of queryThis lets you test different half-life values and compare results.
Building both models in dbt
Add these models to your existing attribution marts structure:
models/├── intermediate/│ └── int__touchpoints.sql├── marts/attribution/│ ├── mrt__attribution_first_touch.sql│ ├── mrt__attribution_last_touch.sql│ ├── mrt__attribution_linear.sql│ ├── mrt__attribution_position_based.sql # New│ └── mrt__attribution_time_decay.sql # NewPosition-based model with configurable weights
Use dbt variables to make weights configurable:
-- mrt__attribution_position_based.sql{% set first_touch_weight = var('attribution_first_weight', 0.4) %}{% set last_touch_weight = var('attribution_last_weight', 0.4) %}{% set middle_weight = 1.0 - first_touch_weight - last_touch_weight %}
WITH touchpoints_positioned AS ( SELECT user_id, transaction_id, channel, source, medium, revenue, touchpoint_timestamp, ROW_NUMBER() OVER ( PARTITION BY user_id, transaction_id ORDER BY touchpoint_timestamp ASC ) AS position, COUNT(*) OVER ( PARTITION BY user_id, transaction_id ) AS total_touches FROM {{ ref('int__touchpoints') }})SELECT user_id, transaction_id, channel, source, medium, touchpoint_timestamp, CASE WHEN total_touches = 1 THEN 1.0 WHEN total_touches = 2 THEN 0.5 WHEN position = 1 THEN {{ first_touch_weight }} WHEN position = total_touches THEN {{ last_touch_weight }} ELSE {{ middle_weight }} / (total_touches - 2) END * revenue AS attributed_revenue, 'position_based' AS attribution_modelFROM touchpoints_positionedConfigure weights in dbt_project.yml:
vars: attribution_first_weight: 0.4 attribution_last_weight: 0.4 attribution_half_life_days: 7Time-decay model with configurable half-life
-- mrt__attribution_time_decay.sql{% set half_life_days = var('attribution_half_life_days', 7) %}
WITH decay_weights AS ( SELECT user_id, transaction_id, channel, source, medium, revenue, touchpoint_timestamp, conversion_timestamp, POW( 0.5, TIMESTAMP_DIFF(conversion_timestamp, touchpoint_timestamp, HOUR) / ({{ half_life_days }} * 24.0) ) AS raw_weight FROM {{ ref('int__touchpoints') }}),normalized AS ( SELECT *, SUM(raw_weight) OVER ( PARTITION BY user_id, transaction_id ) AS total_weight FROM decay_weights)SELECT user_id, transaction_id, channel, source, medium, touchpoint_timestamp, (raw_weight / total_weight) * revenue AS attributed_revenue, 'time_decay' AS attribution_modelFROM normalizedTesting weighted models
Your tests should verify that weights sum correctly:
-- tests/attribution_weights_sum_to_revenue.sqlWITH model_totals AS ( SELECT transaction_id, SUM(attributed_revenue) AS total_attributed FROM {{ ref('mrt__attribution_position_based') }} GROUP BY transaction_id),actual_revenue AS ( SELECT transaction_id, MAX(revenue) AS actual_revenue FROM {{ ref('int__touchpoints') }} GROUP BY transaction_id)SELECT m.transaction_id, m.total_attributed, a.actual_revenueFROM model_totals mJOIN actual_revenue a ON m.transaction_id = a.transaction_idWHERE ABS(m.total_attributed - a.actual_revenue) > 0.01This returns rows only when attribution doesn’t match actual revenue, making it easy to spot calculation errors.
Choosing between position-based and time-decay
These models make different assumptions about what matters in a customer journey.
Use position-based when:
- You believe first and last touches are genuinely more impactful
- Your funnel has clear “discovery” and “conversion” moments
- You want to credit both awareness and closing channels
- Sales cycles are relatively consistent in length
Use time-decay when:
- Recent touchpoints demonstrably influence conversion more
- You’re optimizing for immediate conversion impact
- Sales cycles vary widely in length
- You want to weight urgency and recency into the model
Consider running both and comparing results. When position-based and time-decay produce similar channel rankings, you have higher confidence in the findings. When they diverge significantly, that’s a signal to investigate why.
Moving toward data-driven models
Position-based and time-decay are improvements over simple first/last-touch, but they still rely on assumptions about how credit should distribute. The weights are business decisions, not derived from your actual data.
Data-driven models like Markov chains and Shapley values calculate weights based on observed conversion patterns. They answer: “Given our historical data, how much did each channel actually contribute to conversions?”
These models require more data (hundreds of conversions minimum) and more complex implementation, but they provide the most defensible attribution when you have sufficient volume.
For most companies, building solid first-touch, last-touch, linear, position-based, and time-decay models provides a strong foundation. Run them in parallel, compare results across models, and use the divergence to understand where your attribution has high confidence versus uncertainty.