ServicesAboutNotesContact Get in touch →
EN FR
Note

Fivetran dbt Packages for CRM

What dbt_salesforce and dbt_hubspot provide out of the box — model coverage, configuration, pass-through columns, history mode support, and naming convention tradeoffs.

Planted
dbtbigquerydata modelingdata engineering

The Fivetran dbt packages handle most common CRM modeling patterns_fivetran_deleted filtering, column renaming, join logic, history tracking — and are in use across thousands of installations. This note covers what dbt_salesforce and dbt_hubspot provide, their configuration options, and when building custom models is more appropriate.

dbt_salesforce

Package: fivetran/dbt_salesforce (v2.0.0) Model count: ~23 models Dependencies: fivetran/dbt_salesforce_source, fivetran/fivetran_utils

What You Get

The package produces four main mart models:

  • salesforce__opportunity_enhanced — Wide denormalized opportunity table with account, owner, and contact information pre-joined. This is the one big table pattern applied to CRM data, and it’s what most sales dashboards query.
  • salesforce__owner_performance — Aggregated metrics by opportunity owner: win rate, pipeline value, average deal size, average days to close.
  • salesforce__manager_performance — Same metrics rolled up to the manager level, using Salesforce’s user.manager_id hierarchy.
  • salesforce__sales_snapshot — Point-in-time pipeline summary by date, useful for tracking how pipeline has changed over time.

The source package (dbt_salesforce_source) handles the base layer — one staging model per Salesforce object with _fivetran_deleted filtering, column renaming, and type casting.

Pass-Through Columns

Salesforce orgs are heavily customized. The standard package models handle standard fields, but your org almost certainly has custom fields (__c) that matter for reporting. Pass-through columns let you bring these into the package’s output models without forking the package:

dbt_project.yml
vars:
salesforce__opportunity_pass_through_columns:
- name: custom_field__c
alias: opportunity__custom_field
- name: territory__c
alias: opportunity__territory
transform_sql: UPPER(territory__c)

The transform_sql option is particularly useful — you can rename, cast, or transform the custom field as it passes through. This avoids needing a separate intermediate model just to clean up a custom field.

Pass-through columns are available for opportunities, accounts, contacts, and users. Configure them for every custom field your stakeholders need in reporting.

History Mode Support

The package integrates with Fivetran History Mode for tracking changes over time:

dbt_project.yml
vars:
salesforce__account_history_enabled: true
salesforce__opportunity_history_enabled: true
salesforce__contact_history_enabled: true
global_history_start_date: '2024-01-01'

When enabled, the package produces daily history models that show the state of each record on each day. The global_history_start_date limits how far back the history goes, controlling storage costs.

This is an alternative to running your own dbt snapshots. The advantage is tighter integration with Fivetran’s sync-time change tracking. The disadvantage is that it’s Fivetran-specific and doesn’t capture formula field changes.

Multi-Org Support

If your company uses multiple Salesforce orgs (common with acquisitions or regional divisions), the package supports a source_relation column that identifies which org each record came from:

vars:
salesforce_sources:
- database: raw_salesforce_us
schema: salesforce
- database: raw_salesforce_eu
schema: salesforce

Each model includes a source_relation column, and mart models handle cross-org deduplication where applicable.

dbt_hubspot

Package: fivetran/dbt_hubspot (v1.6.1) Model count: Up to 147 models (all components enabled) Dependencies: fivetran/dbt_hubspot_source, fivetran/fivetran_utils

Configuration Is Critical

HubSpot’s package is much larger than Salesforce’s because HubSpot covers sales, marketing, and service — three distinct product areas with their own data models. Enable only what you use:

dbt_project.yml
vars:
hubspot_sales_enabled: true
hubspot_marketing_enabled: false
hubspot_service_enabled: false

If you only use HubSpot for sales CRM, disabling marketing and service removes ~100 models from your DAG. This isn’t just about build time — fewer models mean a simpler DAG, fewer potential breakpoints, and easier debugging.

Sales Models

When hubspot_sales_enabled: true, you get:

  • Deal pipeline models — Deals enriched with stage, owner, and company information
  • Contact models — Contacts with lifecycle stage tracking and company associations
  • Company models — Companies with aggregated deal and contact metrics
  • Engagement models — Calls, emails, meetings, notes, and tasks linked to their parent objects

The package handles HubSpot’s many-to-many association complexity internally. Bridge table joins, primary association logic, and label handling are built in. This alone saves significant development effort.

Marketing Models (When Enabled)

Marketing models cover email events, forms, campaigns, and UTM tracking. If you use HubSpot Marketing Hub, these provide ready-made models for email performance, form submission tracking, and campaign attribution.

Be aware that marketing data tends to be high-volume. Email events (opens, clicks, bounces) can produce millions of rows. If you enable marketing models, configure incremental materialization for the event tables.

Multi-Portal Support

Like the Salesforce package, HubSpot supports multiple portals through source_relation:

vars:
hubspot_sources:
- database: raw_hubspot_us
schema: hubspot
- database: raw_hubspot_eu
schema: hubspot

Using Both Packages Together

Many organizations use Salesforce and HubSpot simultaneously — Salesforce for sales CRM, HubSpot for marketing automation. Running both packages in the same dbt project works, and the source layers don’t conflict because they reference different source schemas.

The challenge is at the mart layer: you may want unified models that combine Salesforce opportunities with HubSpot marketing engagement. The packages don’t provide this — you build it yourself by creating mart models that reference both packages’ outputs:

-- mrt__sales__opportunity_with_marketing.sql
WITH
opportunities AS (
SELECT *
FROM {{ ref('salesforce__opportunity_enhanced') }}
),
marketing_engagement AS (
SELECT
contact_id,
COUNT(*) AS marketing_touches
FROM {{ ref('hubspot__email_events') }}
WHERE event_type = 'CLICK'
GROUP BY 1
)
SELECT
opportunities.*,
COALESCE(marketing_engagement.marketing_touches, 0)
AS opportunity__marketing_touches
FROM opportunities
LEFT JOIN marketing_engagement
ON opportunities.contact_id = marketing_engagement.contact_id

This cross-CRM join requires a shared key — typically email address or an external ID maintained in both systems. The join quality depends entirely on data hygiene in both platforms.

Naming Convention Tradeoffs

Both packages follow Fivetran’s own naming conventions: stg_salesforce__opportunity, salesforce__opportunity_enhanced. If your project uses different conventions (like the double-underscore pattern with base__ prefixes), you have two options:

Accept the inconsistency. Within the package namespace, models use Fivetran’s naming. Your own models use your naming. This is pragmatic and what most teams do. The package models live in their own schema, so the naming difference is contained.

Wrap the outputs. Create your own mart models that reference the package outputs and follow your naming:

-- mrt__sales__opportunity_enhanced.sql
SELECT
opportunity_id AS opportunity__id,
opportunity_name AS opportunity__name,
-- ... rename all columns to your convention
FROM {{ ref('salesforce__opportunity_enhanced') }}

This adds models to your DAG but gives you naming consistency. The wrapper models also serve as a boundary — if you ever swap out the Fivetran package for custom models, only the wrappers need to change, not every downstream reference.

When to Use Packages vs Build Custom

Use the packages when:

  • Your CRM setup is fairly standard (standard objects, typical workflows)
  • You want to get value quickly without building from scratch
  • The package’s output models cover your reporting needs
  • You have pass-through columns to handle custom fields

Build custom when:

  • You have complex custom objects that the package doesn’t cover
  • Your naming conventions are non-negotiable and you don’t want wrapper models
  • You need specific modeling patterns the package doesn’t support (complex account hierarchies, multi-currency, specific attribution logic)
  • The package’s 23+ models (Salesforce) or 147 models (HubSpot) add too much overhead for your use case

The middle path — using the source package for clean base models but building your own intermediate and mart layers — is often the best tradeoff. You get reliable extraction handling without committing to the package’s mart-level opinions.