Build vs. Buy Data Pipelines

A reading path through the shifting economics of managed vs. custom data pipelines — from Fivetran's pricing changes through AI-assisted development with dlt to the hybrid strategy

dlt, bigquery, data engineering, etl, cost optimization

This hub covers the shifting economics of managed vs. custom data pipelines. Four notes move from Fivetran’s pricing changes through AI-assisted development with dlt to the hybrid decision framework.

Reading Order

  1. Fivetran MAR Pricing Shift — The catalyst. Fivetran’s March 2025 shift to per-connector MAR (monthly active rows) pricing eliminated bulk discounts, producing 4-8x cost increases for many teams. Marketing data, with its constant retroactive updates, gets hit hardest.

  2. Build vs. Buy Data Pipeline Economics — The convergence. Three independent shifts — pricing unpredictability, measured AI development velocity (55.8% faster with Copilot), and dlt’s production maturity (3M monthly downloads) — compound to flip the traditional calculus.

  3. dlt for AI-Assisted Pipeline Development — The how. dlt’s Python-native, declarative design maps well to AI-assisted development. The REST API builder, BigQuery-specific features, and LLM-friendly documentation make the “build” option practical. Includes the workflow pattern and production results (Artsy: 98% improvement, 96% cost savings).

  4. Hybrid ELT Strategy — The decision framework. When buying still wins (compliance, non-technical teams, connector breadth, urgency), when building wins (high-MAR connectors, custom sources, need for control), and the step-by-step migration path starting with your most expensive connector.
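
As a rough illustration of the migration ordering described in note 4, the sketch below sets aside connectors that must stay on a managed service and ranks the rest by monthly cost. The `Connector` fields, the connector names, and the dollar figures are hypothetical placeholders, not data from the notes. The single compliance flag stands in for the fuller list of buy-side criteria.

```python
from dataclasses import dataclass


@dataclass
class Connector:
    name: str
    monthly_cost_usd: float  # hypothetical figure, e.g. taken from a vendor invoice
    needs_compliance: bool   # assumption: True means the connector stays managed


def migration_candidates(connectors):
    """Return connectors eligible for a custom build, most expensive first."""
    buildable = [c for c in connectors if not c.needs_compliance]
    return sorted(buildable, key=lambda c: c.monthly_cost_usd, reverse=True)


# Hypothetical inventory of managed connectors.
connectors = [
    Connector("ads_platform", 1800.0, needs_compliance=False),
    Connector("payments", 2500.0, needs_compliance=True),
    Connector("crm", 400.0, needs_compliance=False),
]

ordered = migration_candidates(connectors)
print([c.name for c in ordered])
```

Here `payments` is the most expensive connector but is excluded because of the compliance flag, so `ads_platform` becomes the first migration candidate. A real assessment would also weigh team skills, connector breadth, and urgency, as the note describes.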

  • Advertising Data in the Warehouse — If your build-vs-buy question is specifically about ad platform data, that hub covers the full journey from extraction through transformation.
  • Ad Data Extraction Tools — A reference comparison of managed, open-source, and native integration options for ad data specifically.
  • AI Production Gap in Data Engineering — The limitations of AI-assisted development that affect the “build” option: security, compliance, and the gap between demo and production.
  • BigQuery Cost Model — Understanding BigQuery’s cost model helps optimize the pipelines you build, particularly around streaming vs. batch loading and partition pruning.