This hub covers the Salesforce-specific layer of a BigQuery pipeline. The CRM Data Architecture Hub covers architecture patterns that apply to any CRM source (layered dbt models, SCD Type 2 tracking, incremental strategies, Fivetran packages). The notes here address Salesforce’s particular data model: polymorphic foreign keys, formula fields that change without updating timestamps, record types that split objects into different business processes, and account hierarchies that require recursive SQL.
Prerequisites
Before starting, you should be comfortable with:
- CRM Data Extraction Challenges — mutability, soft deletes, formula field blind spots
- CRM Modeling Patterns in dbt — base/intermediate/mart architecture for CRM data
- Salesforce vs HubSpot Data Models — how Salesforce structures its data model
Reading Order
-
Salesforce Ingestion Tool Selection — Choosing between Fivetran, Airbyte, dlt, Hevo, and custom Python for Salesforce extraction. Salesforce-specific connector mechanics, cost realities, and the AppExchange dispute.
-
Salesforce Polymorphic Relationship Resolution — How to resolve
WhoIdandWhatIdpolymorphic foreign keys using ID prefix routing. The SQL pattern and where it recurs across Tasks, Events, and custom objects. -
Salesforce Record Type Partitioning in dbt — Handling
RecordTypeIdin the warehouse. When to split objects into separate models by record type vs. keeping a single model with a type column. -
Salesforce Person Accounts and Multi-Currency in the Warehouse — Two Salesforce features that break standard assumptions: Person Accounts (merged Account/Contact) and multi-currency (exchange rate conversion in dbt).
-
Salesforce Opportunity Stage Duration Analysis — Calculating time in each pipeline stage using
OpportunityFieldHistoryand LEAD window functions. Bottleneck detection and win rate metrics. -
Salesforce Account Hierarchy with Recursive CTEs — Resolving self-referential
ParentAccountIdinto a flattened hierarchy for revenue rollup across corporate families. -
Salesforce Unified Activity Timeline — Combining Tasks and Events into a single chronological view with polymorphic entity resolution.
Existing Notes That Apply
These garden notes from the broader CRM architecture series apply directly to Salesforce work:
- Fivetran dbt Packages for CRM — pass-through columns for custom fields, history mode, multi-org support
- SCD Type 2 with dbt Snapshots — tracking historical state of CRM records
- CRM Data Extraction Challenges — formula field blind spots with SystemModStamp
Related
- HubSpot BigQuery Pipeline Hub — HubSpot-to-BigQuery pipeline, covering the association-based data model