The removal effect is the mathematical core of data-driven attribution. It answers the question that heuristic models can’t: if this channel didn’t exist, how many fewer conversions would we see?
Both Markov chain and Shapley value attribution use removal effects to measure channel contribution. The approaches differ in how they calculate the counterfactual (what happens without the channel), but the underlying principle is the same: a channel’s value equals the damage caused by its absence.
The formula
The removal effect for a channel is calculated as:
Removal Effect(Channel X) = 1 - (P(conversion without X) / P(conversion with all channels))

Where:
- P(conversion with all channels) is the baseline conversion probability using all available channels
- P(conversion without X) is the conversion probability when channel X is removed from the journey graph
A removal effect of 30% means conversion probability drops by 30% when that channel is removed. A removal effect of 0% means the channel adds no measurable value — users find other paths to convert without it.
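As a sanity check, the formula is a one-liner. A minimal sketch, using the hypothetical 50%-to-35% drop described above (the function name is ours):

```python
def removal_effect(p_with_all: float, p_without_x: float) -> float:
    """Removal Effect(X) = 1 - P(conversion without X) / P(conversion with all channels)."""
    return 1 - p_without_x / p_with_all

# A 50% baseline that falls to 35% without the channel is a 30% removal effect.
print(f"{removal_effect(0.50, 0.35):.0%}")
```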
Worked example
Suppose your transition matrix produces these results:
- Total conversion probability with all channels: 50%
- Conversion probability without Paid Search: 35%
- Conversion probability without Email: 45%
- Conversion probability without Direct: 48%
The removal effects would be:
- Paid Search: 1 - (0.35 / 0.50) = 30%
- Email: 1 - (0.45 / 0.50) = 10%
- Direct: 1 - (0.48 / 0.50) = 4%
Paid Search has the highest removal effect — removing it causes the biggest drop in conversion probability. This means Paid Search is the most critical channel in your journey graph, regardless of where it appears in the journey sequence.
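The worked example can be reproduced in a few lines; the probabilities are the figures listed above, not output from a real transition matrix:

```python
# Conversion probabilities from the worked example.
P_ALL = 0.50
p_without = {"Paid Search": 0.35, "Email": 0.45, "Direct": 0.48}

# Removal effect: 1 - P(conversion without X) / P(conversion with all channels).
removal_effects = {ch: 1 - p / P_ALL for ch, p in p_without.items()}
for ch, effect in removal_effects.items():
    print(f"{ch}: {effect:.0%}")
```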
Why removal effects don’t sum to 100%
These percentages sum to 44%, not 100%. This is expected and correct. Removal effects measure marginal contribution, and those contributions overlap. Removing Paid Search affects paths that also included Email. The two channels interact — they’re not independent.
This is actually a feature, not a limitation. The overlap tells you something important about channel interdependence. If two channels have high individual removal effects but their combined removal effect is only slightly higher than either alone, they serve similar roles in the journey. Users substitute one for the other.
Normalizing to attribution shares
To get attribution shares that sum to 100% of conversions, normalize the removal effects:
Total Removal Effects = 30% + 10% + 4% = 44%
Attribution Share(Paid Search) = 30% / 44% = 68%
Attribution Share(Email) = 10% / 44% = 23%
Attribution Share(Direct) = 4% / 44% = 9%

Now multiply these shares by your total conversions to get attributed conversions per channel. If you had 1,000 conversions, Paid Search gets 680, Email gets 230, and Direct gets 90.
The normalization step is what makes removal effect attribution comparable to heuristic models — it produces numbers that sum to your actual conversion total, just like first-touch, last-touch, or linear attribution.
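The normalization step, sketched in code with the removal effects from the worked example (shares are rounded to two decimals before allocating, matching the figures above):

```python
# Removal effects from the worked example.
removal_effects = {"Paid Search": 0.30, "Email": 0.10, "Direct": 0.04}
total = sum(removal_effects.values())  # 44%

# Normalize so shares sum to 100%, then allocate actual conversions.
shares = {ch: round(effect / total, 2) for ch, effect in removal_effects.items()}
conversions = 1000
attributed = {ch: round(share * conversions) for ch, share in shares.items()}
print(shares)      # {'Paid Search': 0.68, 'Email': 0.23, 'Direct': 0.09}
print(attributed)  # {'Paid Search': 680, 'Email': 230, 'Direct': 90}
```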
How removal effects differ from position-based credit
Consider a channel that appears in many converting journeys but always in the middle of long paths. Under position-based attribution, it gets at most a small share of the 20% allocated to middle touches. Under last-touch, it gets nothing. Under first-touch, nothing again.
But if removing that channel from the journey graph causes a 25% drop in conversion probability — because it serves as a critical bridge between awareness and conversion — its removal effect captures this importance. The channel is valuable not because of where it sits, but because of what happens without it.
This is the fundamental shift from heuristic to data-driven attribution. Position-based models reward where a channel appears. Removal effects reward what a channel does.
Removal effects vs. incrementality
The removal effect is a model-based estimate of what would happen if a channel were removed. Incrementality testing actually removes the channel (via holdout tests) and measures the real-world impact.
The two should be directionally aligned. If Markov attribution gives Paid Search a 30% removal effect, and a holdout test shows 25% incremental lift, the model is reasonably calibrated. If the removal effect says 30% but incrementality says 5%, either the model is over-crediting the channel (perhaps it appears in many journeys but doesn’t cause conversions) or the incrementality test had design flaws.
Use incrementality results to build intuition about where your removal effect calculations are trustworthy and where they might overstate a channel’s contribution. Over time, this calibration makes the model outputs more actionable even without running continuous experiments.
Implementation context
In Markov chain attribution, the removal effect is calculated by removing a channel's row and column from the transition matrix and recomputing the probability of reaching CONVERSION from START through all remaining paths. This requires matrix operations that go beyond SQL's capabilities; it is typically handled by dedicated packages such as ChannelAttribution (R) or marketing-attribution-models (Python).
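A minimal sketch of that computation with numpy, assuming a tiny invented transition matrix (two channels plus the START, CONVERSION, and NULL states); removing a channel redirects its probability mass to NULL, and the baseline conversion probability comes from the absorbing chain's fundamental matrix:

```python
import numpy as np

# Hypothetical 5-state transition matrix (rows sum to 1); channel names
# and probabilities are invented for illustration.
STATES = ["START", "Paid Search", "Email", "CONVERSION", "NULL"]
T = np.array([
    [0.0, 0.6, 0.4, 0.0, 0.0],  # START
    [0.0, 0.0, 0.3, 0.4, 0.3],  # Paid Search
    [0.0, 0.2, 0.0, 0.3, 0.5],  # Email
    [0.0, 0.0, 0.0, 1.0, 0.0],  # CONVERSION (absorbing)
    [0.0, 0.0, 0.0, 0.0, 1.0],  # NULL (absorbing)
])
START, CONV, NULL = 0, 3, 4

def conversion_prob(T):
    """P(absorbed in CONVERSION | START), via the fundamental matrix."""
    transient = [i for i in range(len(T)) if i not in (CONV, NULL)]
    Q = T[np.ix_(transient, transient)]         # transient -> transient
    R = T[np.ix_(transient, [CONV])]            # transient -> CONVERSION
    B = np.linalg.solve(np.eye(len(Q)) - Q, R)  # (I - Q)^-1 R
    return B[transient.index(START), 0]

def removal_effect(T, channel):
    """Redirect every transition into `channel` to NULL and recompute."""
    T2 = T.copy()
    T2[:, NULL] += T2[:, channel]  # journeys that needed the channel drop out
    T2[:, channel] = 0.0
    T2[channel, :] = 0.0
    T2[channel, NULL] = 1.0        # keep the removed row stochastic
    return 1 - conversion_prob(T2) / conversion_prob(T)
```

For this toy matrix, `removal_effect(T, 1)` and `removal_effect(T, 2)` give the removal effects for Paid Search and Email respectively; production packages do the same computation over a matrix estimated from real journey data.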
In Shapley value attribution, the removal effect is implicit in the marginal contribution calculation. For every possible subset of channels, Shapley values compute how much adding a channel increases conversion probability — which is the inverse perspective on removal. Instead of “what do we lose without it?”, Shapley asks “what do we gain by adding it?” The answers converge to the same insight through different mathematical paths.
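The marginal-contribution view can be sketched directly: average, over all channel orderings, how much adding the channel raises the coalition's conversion probability. The coalition values below are invented for illustration; in practice each would be estimated from journeys touching only that subset of channels:

```python
from itertools import permutations
from math import factorial

# Invented coalition values: conversion probability when only these
# channels exist.
v = {
    frozenset(): 0.00,
    frozenset({"Paid Search"}): 0.20,
    frozenset({"Email"}): 0.10,
    frozenset({"Paid Search", "Email"}): 0.35,
}
channels = ["Paid Search", "Email"]

def shapley_value(channel):
    """Average marginal gain from adding `channel`, over all orderings."""
    total = 0.0
    for order in permutations(channels):
        before = frozenset(order[: order.index(channel)])
        total += v[before | {channel}] - v[before]
    return total / factorial(len(channels))

print(round(shapley_value("Paid Search"), 3))  # 0.225
print(round(shapley_value("Email"), 3))        # 0.125
```

Note that the two values sum to 0.35, the grand coalition's conversion probability: Shapley values distribute the full outcome by construction, which is why they already behave like the normalized attribution shares described earlier.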