ServicesAboutNotesContact Get in touch →
EN FR
Note

Salesforce Record Type Partitioning in dbt

How to handle Salesforce RecordTypeId in the warehouse — filtering by record type in base models, splitting objects into separate models, and storing IDs in dbt vars.

Planted
dbtbigquerydata modelingdata engineering

A single Salesforce object can behave like multiple different entities depending on its RecordTypeId. When record types represent distinct business processes — different stage sequences, validation rules, or metrics — modeling them as a single entity produces mixed results that are not useful for analysis by record type.

What Record Types Do in Salesforce

Record types let Salesforce admins define different “flavors” of the same object. The most common example: Opportunity record types like “New Business”, “Renewal”, and “Upsell.” Each record type can have:

  • Different picklist values — the stage names for new business opportunities might be Prospecting → Qualification → Proposal → Negotiation → Closed Won. Renewal stages might be Upcoming → In Review → Approved → Closed Won. Same field (StageName), completely different value sets.
  • Different validation rules — new business might require a discovery date, renewals might require a renewal term.
  • Different page layouts — sales reps see different fields depending on the record type.

From the Salesforce UI, these look like different objects. From the database, they’re all rows in the same Opportunity table, differentiated only by the RecordTypeId column.

When to Split

The decision depends on whether record types represent genuinely different business processes or minor variations of the same one. If stage names, cycle times, and key metrics differ by record type (e.g., New Business vs. Renewal), splitting is appropriate. If record types differ only in page layout or validation rules but share the same pipeline structure, a single model with a type column is sufficient.

Splitting by Record Type in Base Models

When record types represent different pipelines, create separate base models filtered by RecordTypeId:

-- base__salesforce__opportunity_new_business.sql
SELECT
id AS opportunity__id,
account_id AS opportunity__account_id,
owner_id AS opportunity__owner_id,
name AS opportunity__name,
stage_name AS opportunity__stage,
amount AS opportunity__amount,
close_date AS opportunity__close_at,
is_won AS opportunity__is_won,
is_closed AS opportunity__is_closed,
created_date AS opportunity__created_at
FROM {{ source('salesforce', 'opportunity') }}
WHERE NOT _fivetran_deleted
AND record_type_id = '{{ var("salesforce_new_business_record_type_id") }}'
-- base__salesforce__opportunity_renewal.sql
SELECT
id AS opportunity__id,
account_id AS opportunity__account_id,
owner_id AS opportunity__owner_id,
name AS opportunity__name,
stage_name AS opportunity__stage,
amount AS opportunity__amount,
close_date AS opportunity__close_at,
is_won AS opportunity__is_won,
is_closed AS opportunity__is_closed,
created_date AS opportunity__created_at
FROM {{ source('salesforce', 'opportunity') }}
WHERE NOT _fivetran_deleted
AND record_type_id = '{{ var("salesforce_renewal_record_type_id") }}'

Each model looks nearly identical, but the WHERE clause on record_type_id separates them. Downstream intermediate and mart models can then apply record-type-specific logic — different stage mappings, different metrics, different grain.

This is a judgment call about where to put the split. Some teams prefer filtering at the intermediate layer instead, keeping a single base model for all opportunities and branching later. Both work. The base-layer split is cleaner when the record types truly represent different entities — it keeps your DAG honest about what each branch represents.

Storing Record Type IDs in dbt Vars

Record type IDs are 18-character Salesforce IDs that your Salesforce admin assigns when creating record types. They’re stable — they don’t change unless someone deletes and recreates the record type — but they’re opaque. Nobody reading your SQL knows what 0125f000000abCDEFG means.

Store them in dbt_project.yml as vars:

dbt_project.yml
vars:
salesforce_new_business_record_type_id: "0125f000000abCDEFG"
salesforce_renewal_record_type_id: "0125f000000ghIJKLM"
salesforce_upsell_record_type_id: "0125f000000noPQRST"

This gives you:

  1. Documentation — the var name explains what each ID means.
  2. Single point of change — when your Salesforce admin creates a new record type, you update one file.
  3. Environment flexibility — sandbox and production Salesforce orgs may have different record type IDs. You can override vars per environment in your dbt profiles.

The Alternative: A Single Model with a Type Column

If your record types share the same pipeline structure and your stakeholders view them as subsets of the same entity rather than separate entities, a single model with a record_type column works fine:

-- base__salesforce__opportunity.sql
SELECT
id AS opportunity__id,
account_id AS opportunity__account_id,
record_type_id AS opportunity__record_type_id,
-- ... other fields
FROM {{ source('salesforce', 'opportunity') }}
WHERE NOT _fivetran_deleted

Then join to a RecordType reference table in an intermediate model to get human-readable names:

-- int__opportunity_with_record_type.sql
SELECT
opp.*,
rt.name AS opportunity__record_type_name
FROM {{ ref('base__salesforce__opportunity') }} AS opp
LEFT JOIN {{ ref('base__salesforce__record_type') }} AS rt
ON opp.opportunity__record_type_id = rt.record_type__id

This preserves a single opportunity lineage while still making the record type available for filtering in dashboards and mart models.

Record Types Beyond Opportunity

Opportunity is the most common object with record types, but any Salesforce object can have them. Common examples:

  • Account — “Customer” vs. “Partner” vs. “Vendor”
  • Case — “Support” vs. “Bug Report” vs. “Feature Request”
  • Lead — “Inbound” vs. “Outbound” vs. “Partner Referral”

The same pattern applies: if the record types represent fundamentally different business processes, split. If they’re variations of the same process, keep them together with a type column. The Fivetran dbt_salesforce package doesn’t split by record type by default — it gives you a single opportunity model. If your org needs the split, you’ll either customize the package or build your own base models around it.

How to Find Your Record Type IDs

Query the RecordType table in your warehouse (Fivetran and other connectors extract it automatically):

SELECT
id AS record_type__id,
sobject_type AS record_type__object,
name AS record_type__name,
is_active AS record_type__is_active
FROM {{ source('salesforce', 'record_type') }}
WHERE sobject_type = 'Opportunity'
ORDER BY name

This gives you the mapping between IDs and human-readable names. Run this query when setting up your project and whenever your Salesforce admin notifies you of changes.