Unit Testing Conversion Funnels in dbt

Funnel analysis tracks how users progress through a defined sequence of steps — page view, add to cart, checkout, purchase. At each step, some users drop off. The conversion rate between steps is a key metric for identifying friction points and prioritizing product improvements.

The logic involves counting distinct users at each step, calculating step-over-step conversion rates (what percentage of users from step N reach step N+1), and handling users who skip steps. Getting the denominator wrong in conversion rate calculations is the most common bug — dividing by total users instead of previous-step users, for instance.

The Drop-Off Testing Pattern

The most effective way to test funnel logic is to create users who drop off at each step. This gives you a known, traceable set of numbers:

unit_tests:
  - name: test_mrt_marketing_conversion_funnel_dropoff
    model: mrt__marketing__conversion_funnel
    description: "Funnel should calculate conversion rates between steps"
    given:
      - input: ref('base__ga4__events')
        rows:
          # User 1: completes full funnel
          - {user_id: 1, event_name: "page_view", event_date: "2024-06-01"}
          - {user_id: 1, event_name: "add_to_cart", event_date: "2024-06-01"}
          - {user_id: 1, event_name: "begin_checkout", event_date: "2024-06-01"}
          - {user_id: 1, event_name: "purchase", event_date: "2024-06-01"}
          # User 2: drops at checkout
          - {user_id: 2, event_name: "page_view", event_date: "2024-06-01"}
          - {user_id: 2, event_name: "add_to_cart", event_date: "2024-06-01"}
          - {user_id: 2, event_name: "begin_checkout", event_date: "2024-06-01"}
          # User 3: drops at cart
          - {user_id: 3, event_name: "page_view", event_date: "2024-06-01"}
          - {user_id: 3, event_name: "add_to_cart", event_date: "2024-06-01"}
          # User 4: only views
          - {user_id: 4, event_name: "page_view", event_date: "2024-06-01"}
    expect:
      rows:
        - {funnel_step: "page_view", user_count: 4, step_conversion_rate: 1.0}
        - {funnel_step: "add_to_cart", user_count: 3, step_conversion_rate: 0.75}
        - {funnel_step: "begin_checkout", user_count: 2, step_conversion_rate: 0.67}
        - {funnel_step: "purchase", user_count: 1, step_conversion_rate: 0.5}

Let’s trace through the numbers:

User	page_view	add_to_cart	begin_checkout	purchase
1	yes	yes	yes	yes
2	yes	yes	yes
3	yes	yes
4	yes

Calculating conversion rates:

page_view: 4 users, 100% (starting point, always 1.0)
add_to_cart: 3 users, 75% of page_view users (3/4)
begin_checkout: 2 users, 67% of add_to_cart users (2/3)
purchase: 1 user, 50% of checkout users (1/2)

The test verifies both user counts and conversion rates. If the model calculates conversion rate incorrectly — dividing by total users (1/4 = 25%) instead of previous step users (1/2 = 50%) for purchase — this test will fail.

Why This Test Data Design Works

The key insight is that each user drops off at a different step, creating a predictable staircase pattern. This makes the math transparent and each conversion rate unique:

4, 3, 2, 1 users per step produces rates of 1.0, 0.75, 0.67, 0.5
These are all different numbers, so you can tell exactly which calculation is wrong if a test fails
If all conversion rates were 0.5, a bug that returns the right answer for the wrong reason would pass

Compare this to test data where 10 users complete step 1, 5 complete step 2, and 5 complete step 3. The rate from step 2 to step 3 is 1.0, which means a bug in the step-over-step calculation (like using step 1 as the denominator) might produce 0.5 — the same as the correct rate from step 1 to step 2. Distinctive values at each step make bugs more visible.

The Step-Skipping Edge Case

Real users don’t always follow the expected sequence. Someone might purchase without adding to cart (say, through a one-click buy button). How should your funnel handle this?

There are two valid approaches:

Strict funnel: Only count users who completed the previous step. A user who purchases without adding to cart doesn’t count at the purchase step.
Loose funnel: Count users who completed the step regardless of prior steps. A user who purchases without adding to cart counts at purchase.

Whichever you choose, write a test that documents it:

unit_tests:
  - name: test_funnel_step_skipping
    model: mrt__marketing__conversion_funnel
    description: "Users who skip steps should [be excluded / be counted]"
    given:
      - input: ref('base__ga4__events')
        rows:
          - {user_id: 1, event_name: "page_view", event_date: "2024-06-01"}
          - {user_id: 1, event_name: "purchase", event_date: "2024-06-01"}  # skipped cart + checkout
    expect:
      rows:
        - {funnel_step: "page_view", user_count: 1, step_conversion_rate: 1.0}
        - {funnel_step: "add_to_cart", user_count: 0, step_conversion_rate: 0.0}
        - {funnel_step: "begin_checkout", user_count: 0, step_conversion_rate: 0.0}
        - {funnel_step: "purchase", user_count: 0, step_conversion_rate: 0.0}  # strict funnel

In a strict funnel, User 1’s purchase doesn’t count because they never added to cart. In a loose funnel, the purchase step would show user_count: 1. The test makes the business rule explicit and prevents someone from accidentally changing the behavior during a refactor.

Practical Considerations

Rounding: Conversion rates often involve division that produces repeating decimals (2/3 = 0.666…). Your test needs to match whatever rounding your model applies. If the model rounds to two decimal places, the expected value should be 0.67. If it doesn’t round, you’ll need to handle precision in the comparison.

Multiple events per step: A user might trigger page_view five times. Your funnel should count them once at that step. If you’re using COUNT(DISTINCT user_id), this is handled automatically. If you’re using COUNT(*), the test should expose the over-counting.

Date ranges: Funnels are usually calculated per period (daily, weekly). Include a test where the same user completes funnel steps across different periods to verify the grouping logic.