Salesforce and HubSpot take fundamentally different approaches to data structure. These differences shape base models, join patterns, intermediate enrichment logic, and schema design.
Salesforce: Metadata-Driven, Relational
Salesforce uses a metadata-driven, multi-tenant architecture. Under the hood, all customer data lives in shared tables, with metadata defining the schema for each tenant (org). But from a data modeling perspective, what matters is that Salesforce data is inherently relational with well-defined foreign keys.
Standard Objects
Standard objects follow predictable schemas that are consistent across every Salesforce org:
- Account — Companies or organizations
- Contact — People associated with accounts
- Lead — Prospects not yet associated with accounts
- Opportunity — Deals in the pipeline
- Case — Support tickets
- Campaign — Marketing campaigns and their members
Each object maps cleanly to a raw table in your warehouse, then to a base model, then to one or more intermediate or mart models.
Custom Objects
Custom objects carry a __c suffix: Territory__c, Subscription__c, Product_Usage__c. Their fields also carry the suffix: annual_revenue__c, renewal_date__c. This suffix is preserved through extraction into your warehouse, so your base models need to handle renaming them to your project’s naming conventions.
-- base__salesforce__territory.sqlSELECT id AS territory__id, name AS territory__name, region__c AS territory__region, target_revenue__c AS territory__target_revenue, owner_id AS territory__owner_idFROM {{ source('salesforce', 'territory__c') }}WHERE NOT _fivetran_deletedRelationship Types
Salesforce relationships come in two flavors:
Lookup relationships create a loose coupling with a nullable foreign key. A Contact has a lookup to Account via account_id, but the Contact can exist without an Account. In your warehouse, this means LEFT JOINs — not every Contact will have matching Account data.
Master-Detail relationships create tight coupling with cascade deletes. If the master record is deleted, all detail records go with it. The foreign key is required (non-nullable). In your warehouse, this means INNER JOINs are safe — every detail record has a master.
The practical implication: when writing your intermediate models, check whether a Salesforce relationship is Lookup or Master-Detail. It determines whether you use LEFT JOIN or INNER JOIN, and whether you need to handle NULLs in the foreign key column.
Polymorphic Relationships
Salesforce has polymorphic fields like WhoId on Task and Event, which can reference either a Contact or a Lead. And WhatId, which can reference an Account, Opportunity, or custom object. These require CASE-based routing in your intermediate layer:
CASE WHEN who_id LIKE '003%' THEN 'Contact' WHEN who_id LIKE '00Q%' THEN 'Lead'END AS who_typeSalesforce ID prefixes are deterministic — 003 is always Contact, 00Q is always Lead, 001 is always Account, 006 is always Opportunity. This prefix-based routing is reliable but non-obvious if you haven’t worked with Salesforce before.
HubSpot: Association-Based, Many-to-Many
HubSpot takes a fundamentally different approach. Instead of foreign keys on the objects themselves, HubSpot uses a separate associations system to connect objects.
Core Objects
HubSpot’s core objects are simpler than Salesforce’s standard objects:
- Contacts — People
- Companies — Organizations
- Deals — Sales opportunities
- Tickets — Support requests
Properties are prefixed with property_ in the extracted data: property_email, property_firstname, property_lifecyclestage. Your base models rename these:
-- base__hubspot__contact.sqlSELECT id AS contact__id, property_email AS contact__email, property_firstname AS contact__first_name, property_lastname AS contact__last_name, property_lifecyclestage AS contact__lifecycle_stage, property_hs_lead_status AS contact__lead_status, property_createdate AS contact__created_atFROM {{ source('hubspot', 'contact') }}WHERE NOT _fivetran_deletedThe Association Model
This is where HubSpot diverges sharply from Salesforce. In Salesforce, a Contact belongs to one Account (via account_id). In HubSpot, a Contact can be associated with multiple Companies, and a Deal can involve multiple Contacts in different roles.
Associations live in separate bridge tables:
contact_company— Links contacts to companies (many-to-many)deal_contact— Links deals to contacts (many-to-many)deal_company— Links deals to companies (many-to-many)
Each association can carry an optional label like “Decision Maker”, “Billing Contact”, or “Primary Company”. These labels add semantic meaning to the relationship but complicate your join logic.
In practice, this means your intermediate models need explicit bridge table joins:
-- int__deal_contact_enriched.sqlWITH
deals AS ( SELECT deal__id, deal__name, deal__amount, deal__stage FROM {{ ref('base__hubspot__deal') }}),
contacts AS ( SELECT contact__id, contact__email, contact__first_name FROM {{ ref('base__hubspot__contact') }}),
associations AS ( SELECT deal_id AS deal__id, contact_id AS contact__id, label AS association__label FROM {{ ref('base__hubspot__deal_contact') }})
SELECT deals.deal__id, deals.deal__name, deals.deal__amount, deals.deal__stage, contacts.contact__id, contacts.contact__email, contacts.contact__first_name, associations.association__labelFROM dealsINNER JOIN associations ON deals.deal__id = associations.deal__idINNER JOIN contacts ON associations.contact__id = contacts.contact__idThe output is one row per deal-contact pair, not one row per deal. This is a different grain than what you’d get from a Salesforce opportunity model, and downstream logic needs to account for it.
Primary Association Pattern
When you need one company per contact (to match a Salesforce-like model), you have to pick a “primary” association. Common approaches:
- Filter on a specific label (e.g.,
WHERE association__label = 'Primary') - Take the earliest association by creation date
- Take the company with the most associated contacts (a proxy for “main” company)
None of these are perfect. HubSpot’s data model genuinely supports many-to-many, so reducing it to one-to-one always involves a business decision about which association takes precedence.
Modeling Implications
The structural differences have direct consequences for your dbt project:
| Concern | Salesforce | HubSpot |
|---|---|---|
| Join pattern | Direct foreign key joins | Bridge table joins |
| Grain management | Straightforward — FK preserves grain | Fan-out risk from many-to-many |
| Base model count | One per object | One per object + one per bridge table |
| Primary relationship | Built-in (FK on child) | Must be derived from associations |
| Custom fields | __c suffix, rename in base | property_ prefix, rename in base |
Salesforce modeling ends up being mostly about handling mutability, hierarchy (Account → Contact → Opportunity), and extraction quirks like formula fields. HubSpot modeling is about resolving many-to-many relationships correctly and deciding how to flatten associations for reporting.
Both systems benefit from the same three-layer dbt architecture, but the intermediate layer does very different work for each. For Salesforce, intermediate enriches entities with related data. For HubSpot, intermediate resolves association complexity into queryable joins.