HubSpot tracks property changes differently from Salesforce. Where Salesforce has a single FieldHistory mechanism and per-object history tables controlled through field history tracking settings, HubSpot maintains separate history tables per object type: CONTACT_PROPERTY_HISTORY, DEAL_PROPERTY_HISTORY, COMPANY_PROPERTY_HISTORY, and so on. Each row records a single property change: object ID, property name, old value, new value, timestamp.
Two behaviors cause problems in warehouse pipelines: retention limits and the CALCULATED property type.
Retention Limits
HubSpot’s property history isn’t unlimited audit logging. It keeps a fixed number of historical values per property per record:
- Contacts: up to 45 values per property
- Other objects (companies, deals, tickets): up to 20 values per property
Once a property exceeds its retention cap, older values drop off. If a deal changes stage 25 times, you can only see the most recent 20. If a contact’s email is updated 50 times (rare but possible in data cleanup scenarios), the first 5 are gone.
This has a direct implication for how you use history data. HubSpot property history can answer “how did this property change recently?” but it’s not a reliable audit trail for heavily-mutated records over long time horizons. If you need guaranteed point-in-time state, dbt snapshots on the main object tables are more reliable because you control the retention period.
Property history is most useful for:
- Stage transition analysis (how many deals moved from Proposal to Negotiation last quarter?)
- Recent lifecycle stage changes (who became an MQL in the last 30 days based on history, not just current state?)
- Property audit trails where you need a few months of history rather than years
When your analysis requires looking further back, or for properties that change frequently, work from the main object tables with snapshot history.
The CALCULATED Property Problem
CALCULATED is a property type in HubSpot for computed fields — values derived from formulas, rollup summaries, or system calculations. The problem: CALCULATED properties always update their cursor timestamps, even when the computed value hasn’t actually changed.
Why does this matter? Fivetran and Airbyte both use cursor-based incremental extraction for property history. The cursor is the last-modified timestamp. When a CALCULATED property updates its timestamp without changing its value, the connector picks it up as a changed record and re-syncs it. You get a row in CONTACT_PROPERTY_HISTORY with the same value as the previous row — just a fresh timestamp.
At scale, this inflates sync costs significantly. A HubSpot portal with 200k contacts and a handful of CALCULATED properties can generate millions of spurious history rows per sync cycle. These rows don’t represent real data changes; they’re phantom updates that burn through your monthly active rows (if you’re on Fivetran MAR pricing) and bloat your history tables in BigQuery.
How to detect this: Compare row counts in your history tables against the actual rate of change in your HubSpot portal. If CONTACT_PROPERTY_HISTORY is growing much faster than your contact count, audit which properties are CALCULATED. Pull the property list from the HubSpot API:
import hubspotfrom hubspot.crm.properties import ApiException
client = hubspot.Client.create(access_token="your_token")response = client.crm.properties.core_api.get_all( object_type="contacts")for prop in response.results: if prop.calculated: print(prop.name, prop.field_type)What to do about it: Two options. First, exclude CALCULATED properties from history syncing in your extraction tool configuration. Fivetran allows table-level exclusions; Airbyte lets you deselect streams. If you don’t need the history of computed fields, don’t pay to store it.
Second, if you need some CALCULATED property history for analysis, deduplicate at the base model layer:
-- base__hubspot__contact_property_history.sqlWITH
source AS ( SELECT contact_id, property_name, property_value AS history__value, CAST(timestamp AS TIMESTAMP) AS history__changed_at, -- Deduplicate: only keep rows where value actually changed LAG(property_value) OVER ( PARTITION BY contact_id, property_name ORDER BY timestamp ASC ) AS history__previous_value FROM {{ source('hubspot', 'contact_property_history') }})
SELECT contact_id, property_name, history__value, history__changed_atFROM sourceWHERE history__previous_value IS NULL -- first record for this property OR history__value != history__previous_value -- value actually changedThis removes consecutive identical values, leaving only genuine transitions. It won’t help with sync costs upstream, but it cleans up the data that reaches your mart layer.
Enabling History in the dbt_hubspot Package
The dbt_hubspot package treats property history as an optional module, disabled by default:
vars: hubspot_contact_property_history_enabled: true hubspot_deal_property_history_enabled: trueThe package provides history models with the duplicate-row problem unaddressed. If you enable history modules, apply the deduplication pattern above as a post-hook or by wrapping the package’s intermediate history models in your own layer.
Cost consideration: CONTACT_PROPERTY_HISTORY in particular can become your largest table in BigQuery, especially with CALCULATED properties enabled. Before enabling history in the package, check the row count in your raw history table and estimate monthly growth. For large portals, you may want to enable history for specific high-value properties only, rather than enabling it globally.
Using History for Stage Analysis
The legitimate use case for deal property history is stage transition analysis when you don’t have the dedicated DEAL_STAGE table from Fivetran (or aren’t using Fivetran). If you need to reconstruct how deals moved through stages:
-- int__deal_stage_transitions_from_history.sqlSELECT contact_id AS deal_id, history__value AS deal__to_stage, history__changed_at AS deal__stage_changed_at, LAG(history__value) OVER ( PARTITION BY contact_id ORDER BY history__changed_at ASC ) AS deal__from_stage, DATE_DIFF( history__changed_at, LAG(history__changed_at) OVER ( PARTITION BY contact_id ORDER BY history__changed_at ASC ), DAY ) AS deal__days_in_previous_stageFROM {{ ref('base__hubspot__deal_property_history') }}WHERE property_name = 'dealstage'If you’re using Fivetran, use the DEAL_STAGE table instead — it’s cleaner and purpose-built for this analysis. See HubSpot Deal Stage Modeling for that pattern.
For the full HubSpot pipeline context, including how property history fits into the ingestion tool selection decision, see the HubSpot to BigQuery guide.