ServicesAboutNotesContact Get in touch →
EN FR
Note

Salesforce to BigQuery Pipeline

Hub note for the Salesforce-to-BigQuery pipeline — from ingestion tool selection through polymorphic resolution, stage tracking, account hierarchies, and activity timelines.

Planted
dbtbigquerydata modelingdata engineeringetl

This hub covers the Salesforce-specific layer of a BigQuery pipeline. The CRM Data Architecture Hub covers architecture patterns that apply to any CRM source (layered dbt models, SCD Type 2 tracking, incremental strategies, Fivetran packages). The notes here address Salesforce’s particular data model: polymorphic foreign keys, formula fields that change without updating timestamps, record types that split objects into different business processes, and account hierarchies that require recursive SQL.

Prerequisites

Before starting, you should be comfortable with:

Reading Order

  1. Salesforce Ingestion Tool Selection — Choosing between Fivetran, Airbyte, dlt, Hevo, and custom Python for Salesforce extraction. Salesforce-specific connector mechanics, cost realities, and the AppExchange dispute.

  2. Salesforce Polymorphic Relationship Resolution — How to resolve WhoId and WhatId polymorphic foreign keys using ID prefix routing. The SQL pattern and where it recurs across Tasks, Events, and custom objects.

  3. Salesforce Record Type Partitioning in dbt — Handling RecordTypeId in the warehouse. When to split objects into separate models by record type vs. keeping a single model with a type column.

  4. Salesforce Person Accounts and Multi-Currency in the Warehouse — Two Salesforce features that break standard assumptions: Person Accounts (merged Account/Contact) and multi-currency (exchange rate conversion in dbt).

  5. Salesforce Opportunity Stage Duration Analysis — Calculating time in each pipeline stage using OpportunityFieldHistory and LEAD window functions. Bottleneck detection and win rate metrics.

  6. Salesforce Account Hierarchy with Recursive CTEs — Resolving self-referential ParentAccountId into a flattened hierarchy for revenue rollup across corporate families.

  7. Salesforce Unified Activity Timeline — Combining Tasks and Events into a single chronological view with polymorphic entity resolution.

Existing Notes That Apply

These garden notes from the broader CRM architecture series apply directly to Salesforce work:

  • HubSpot BigQuery Pipeline Hub — HubSpot-to-BigQuery pipeline, covering the association-based data model