HubSpot’s data model does not place object relationships on the objects themselves. There is no company_id column on the contact table, no foreign key pointing from a deal to its associated contacts. Every relationship — contact to company, contact to deal, company to deal, deal to ticket — flows through a dedicated bridge table.
This differs from Salesforce, where a Contact has an AccountId field pointing to its Account. In HubSpot, joining contacts to companies requires passing through contact_company. Without this, multi-company contacts are missed entirely and join counts are wrong.
The Bridge Table Inventory
Fivetran’s HubSpot connector surfaces these association tables, among others:
contact_company— contacts to companies (many-to-many)contact_deal— contacts to deals (many-to-many)deal_company— deals to companies (many-to-many)deal_ticket— deals to support tickets (many-to-many)engagement_contact— engagement records to contacts
Each table has at minimum: the two object IDs, an optional label column, and metadata like _fivetran_synced. The label column is where things get interesting.
Association Labels
On HubSpot Pro and Enterprise, associations can carry a label: “Decision Maker,” “Billing Contact,” “Primary Company,” “Champion.” These labels are defined in HubSpot’s settings and attached at the relationship level, not the contact level. A contact can be a “Decision Maker” on one deal and a “Billing Contact” on another.
Labels are worth extracting. They encode buying committee structure and contact roles in a way that would otherwise require custom properties. A contact_deal association where label = 'Decision Maker' is analytically meaningful — these contacts matter more for deal velocity analysis than passive participants.
Your base model for association tables should preserve the label:
-- base__hubspot__contact_company.sqlSELECT contact_id, company_id, type_id AS association__type_id, label AS association__label, _fivetran_synced AS association__synced_atFROM {{ source('hubspot', 'contact_company') }}No soft-delete filter needed here — these tables track relationships, not records. A deleted contact’s associations disappear when the contact record is removed.
The Fan-Out Problem
When you join through association tables, your row count multiplies. A contact associated with three companies appears three times in a naive join against contact_company. This is correct behavior for the association grain, but it’s almost never what a mart model wants.
The symptom: you join contacts to companies and run COUNT(DISTINCT contact_id) to get contact counts per company. The results look off. Companies with lots of shared contacts report inflated numbers because you’re counting at the wrong grain.
The solution depends on the question you’re answering.
If you need one row per contact with some company context, pick a primary association:
-- int__contact_primary_company.sqlSELECT contact_id, company_id, association__labelFROM {{ ref('base__hubspot__contact_company') }}QUALIFY ROW_NUMBER() OVER ( PARTITION BY contact_id ORDER BY association__synced_at ASC) = 1This takes the earliest-synced company as the “primary” — a reasonable proxy for the original company relationship. If your team has established a label convention like “Primary Company,” filter on that instead:
QUALIFY ROW_NUMBER() OVER ( PARTITION BY contact_id ORDER BY CASE WHEN association__label = 'Primary Company' THEN 0 ELSE 1 END ASC, association__synced_at ASC) = 1If you need aggregate metrics about each company, aggregate from the bridge rather than joining:
-- int__company_contact_metrics.sqlSELECT company_id, COUNT(DISTINCT contact_id) AS company__contacts, COUNT(DISTINCT CASE WHEN association__label = 'Decision Maker' THEN contact_id END) AS company__decision_makersFROM {{ ref('base__hubspot__contact_company') }}GROUP BY company_idThis stays at the company grain and handles multi-company contacts correctly. The intermediate model exposes these metrics to mart models that need them.
Building One Intermediate Model Per Association Type
Don’t try to handle all association types in one model. Build a focused intermediate model for each pairing:
int__contact_company_mapped.sql— contact-to-company with primary resolutionint__contact_deal_mapped.sql— contact-to-deal pairs with labelsint__company_deal_mapped.sql— company-to-deal relationships
These intermediate models become the canonical join path for their association type. When your mart models need to connect contacts to deals, they reference int__contact_deal_mapped, not the raw bridge table. This keeps the fan-out logic in one place.
-- int__contact_deal_mapped.sqlSELECT contact_id, deal_id, association__label, association__synced_atFROM {{ ref('base__hubspot__contact_deal') }}The mart then joins on this:
-- mrt__sales__deal_contacts.sqlSELECT d.deal_id, d.deal__name, d.deal__amount, c.contact_id, c.contact__email, cdm.association__labelFROM {{ ref('hubspot__deals') }} AS dINNER JOIN {{ ref('int__contact_deal_mapped') }} AS cdm ON d.deal_id = cdm.deal_idLEFT JOIN {{ ref('hubspot__contacts') }} AS c ON cdm.contact_id = c.contact_idThe grain here is one row per deal-contact pair, which is the right grain for a deal-contacts model. Any aggregation happens downstream or in a separate mart.
The Primary Company Question Has No Perfect Answer
HubSpot’s association model genuinely supports many-to-many relationships, so any strategy to reduce it to one-to-one involves a business decision. The three approaches you’ll see in practice:
Earliest association wins. The first company a contact was linked to is their “primary.” Simple, deterministic, but may not reflect reality if a contact switches companies.
Label-based selection. If your team uses a “Primary Company” label consistently, use that. This requires enforcement of the labeling convention in HubSpot itself — if sales reps don’t label, this breaks.
Most recent sync. Take the most recently synced association. This tends to reflect the “current” company relationship but is volatile if the sync timestamp isn’t meaningful.
Whatever you choose, document it as a business rule in your intermediate model. Other mart models will build on this assumption, and the choice should be explicit and traceable.
For deeper context on how this differs from Salesforce’s foreign-key model, see Salesforce vs HubSpot Data Models. For the full HubSpot pipeline, see the HubSpot to BigQuery guide.