HubSpot’s lifecycle stage model is unusually friendly to warehouse analytics. Most CRM stage tracking requires either a dedicated history table or snapshot-based reconstruction to know when a contact transitioned between stages. HubSpot does something simpler: every contact gets a dedicated timestamp property for each stage they pass through.
The standard progression is: Subscriber → Lead → Marketing Qualified Lead (MQL) → Sales Qualified Lead (SQL) → Opportunity → Customer → Evangelist. When a contact enters each stage, HubSpot stamps a “Became a [Stage] Date” property on the contact record itself:
property_became_a_lead_dateproperty_became_a_marketing_qualified_lead_dateproperty_became_a_sales_qualified_lead_dateproperty_became_a_customer_date
These are first-touch timestamps — they record when the contact first reached that stage, and they don’t get overwritten if the contact is somehow reassigned. This makes them reliable anchors for funnel analysis.
The Forward-Only Constraint
By default, HubSpot prevents lifecycle stage regression. Once a contact is marked as a Customer, HubSpot won’t let you move them back to Lead through normal UI actions. This is a feature for data integrity, not a bug.
The practical result: if a contact has a became_customer_at timestamp, it means they genuinely reached customer status at that point. You don’t need to worry about a stage reversal invalidating the timestamp. The date means what it says.
The exception is custom lifecycle stage configurations where admins have disabled this constraint. If your HubSpot portal has non-standard stage settings, verify the forward-only rule is enforced before treating these timestamps as reliable.
Extracting Lifecycle Timestamps in the Base Model
Map the lifecycle stage properties explicitly in your base model:
-- base__hubspot__contact.sqlSELECT id AS contact_id, property_email AS contact__email, property_lifecyclestage AS contact__lifecycle_stage, property_hs_lead_status AS contact__lead_status, CAST(property_createdate AS TIMESTAMP) AS contact__created_at, CAST(property_became_a_lead_date AS TIMESTAMP) AS contact__became_lead_at, CAST(property_became_a_marketing_qualified_lead_date AS TIMESTAMP) AS contact__became_mql_at, CAST(property_became_a_sales_qualified_lead_date AS TIMESTAMP) AS contact__became_sql_at, CAST(property_became_a_customer_date AS TIMESTAMP) AS contact__became_customer_at, _fivetran_deleted AS contact__is_deletedFROM {{ source('hubspot', 'contact') }}WHERE NOT COALESCE(_fivetran_deleted, FALSE)This is enough for most funnel analysis. The timestamps are on the contact record, not buried in a history table. No joins required for basic lifecycle metrics.
If you’re using the dbt_hubspot package, configure these as pass-through columns on the package’s contacts model to make them available in the package’s output:
vars: hubspot__contact_pass_through_columns: - name: property_became_a_lead_date alias: contact__became_lead_at - name: property_became_a_marketing_qualified_lead_date alias: contact__became_mql_at - name: property_became_a_sales_qualified_lead_date alias: contact__became_sql_at - name: property_became_a_customer_date alias: contact__became_customer_atBuilding the Lifecycle Funnel Mart
With lifecycle timestamps on the contact record, the funnel model is a single aggregation:
-- mrt__marketing__lifecycle_funnel.sqlSELECT COUNT(contact_id) AS funnel__leads, COUNTIF(contact__became_mql_at IS NOT NULL) AS funnel__mqls, COUNTIF(contact__became_sql_at IS NOT NULL) AS funnel__sqls, COUNTIF(contact__became_customer_at IS NOT NULL) AS funnel__customers, SAFE_DIVIDE( COUNTIF(contact__became_mql_at IS NOT NULL), COUNT(contact_id) ) AS funnel__lead_to_mql_rate, SAFE_DIVIDE( COUNTIF(contact__became_sql_at IS NOT NULL), COUNTIF(contact__became_mql_at IS NOT NULL) ) AS funnel__mql_to_sql_rate, SAFE_DIVIDE( COUNTIF(contact__became_customer_at IS NOT NULL), COUNTIF(contact__became_sql_at IS NOT NULL) ) AS funnel__sql_to_customer_rateFROM {{ ref('hubspot__contacts') }}WHERE contact__became_lead_at IS NOT NULLAdd a time dimension to track how conversion rates shift month over month:
SELECT DATE_TRUNC(contact__became_lead_at, MONTH) AS cohort_month, COUNT(contact_id) AS funnel__leads, COUNTIF(contact__became_mql_at IS NOT NULL) AS funnel__mqls, SAFE_DIVIDE( COUNTIF(contact__became_mql_at IS NOT NULL), COUNT(contact_id) ) AS funnel__lead_to_mql_rateFROM {{ ref('hubspot__contacts') }}WHERE contact__became_lead_at IS NOT NULLGROUP BY 1ORDER BY 1This groups contacts by the month they first became a lead and shows conversion rates for each cohort — useful for tracking whether lead quality is improving or degrading over time.
The Merged Contact Problem
Merged contacts are the main source of lifecycle data anomalies. When HubSpot merges two contact records, the surviving contact inherits properties from both. If the merged-away contact had a became_a_customer_date from a year ago and the surviving contact’s became_a_lead_date is more recent, you end up with a contact that “became a customer” before they “became a lead.”
These sequences are chronologically impossible in real lifecycle progression. They’re almost always merge artifacts.
How to detect them in your base model or a data quality check:
-- dbt test or quality model to flag merge artifactsSELECT contact_id, contact__became_lead_at, contact__became_mql_at, contact__became_sql_at, contact__became_customer_atFROM {{ ref('base__hubspot__contact') }}WHERE -- Customer before lead (contact__became_customer_at < contact__became_lead_at) -- MQL before lead OR (contact__became_mql_at < contact__became_lead_at) -- SQL before MQL OR (contact__became_sql_at < contact__became_mql_at) -- Customer before SQL OR (contact__became_customer_at < contact__became_sql_at)What to do with flagged records depends on your analysis goals. For funnel volume counts, you may want to exclude them entirely. For revenue attribution, you may want to keep them but exclude them from stage duration calculations (days from lead to MQL is meaningless if the dates are inverted). Document whatever decision you make — these records will come up in QA conversations.
The merge artifact issue isn’t unique to HubSpot, but the lifecycle timestamp model surfaces it clearly. It’s actually useful signal: if you’re seeing a lot of inverted sequences, it suggests active contact merging activity in HubSpot that may be worth investigating from a data hygiene perspective.
Velocity Metrics
The lifecycle timestamps enable velocity calculations that are genuinely useful for sales and marketing:
SELECT contact_id, DATE_DIFF(contact__became_mql_at, contact__became_lead_at, DAY) AS contact__days_lead_to_mql, DATE_DIFF(contact__became_sql_at, contact__became_mql_at, DAY) AS contact__days_mql_to_sql, DATE_DIFF(contact__became_customer_at, contact__became_sql_at, DAY) AS contact__days_sql_to_customerFROM {{ ref('hubspot__contacts') }}WHERE contact__became_customer_at IS NOT NULL -- Exclude merge artifacts AND contact__became_customer_at > contact__became_lead_at AND contact__became_sql_at > contact__became_mql_atThese velocity metrics belong in a contact 360 mart or as standalone metrics. Segmenting by acquisition channel, lead source, or deal owner shows differences in stage velocity across segments.
For how lifecycle stage data connects to deal tracking, see HubSpot Deal Stage Modeling. For the full pipeline context, see the HubSpot to BigQuery guide.