dlt (data load tool) is a Python library for building ELT pipelines. Pipelines are standard Python scripts — installed with pip, no containers or orchestration server required. The library handles pagination, schema inference, incremental state, and destination-specific loading.
These notes cover dlt’s core mechanics, BigQuery integration, and incremental loading behavior.
Reading Order
- dlt Core Concepts — The four building blocks: sources, resources, pipelines, and schemas. Plus the three write dispositions (replace, append, merge) that control how data lands. Start here if you’re new to dlt.
- dlt and BigQuery Integration — The BigQuery-specific layer: streaming inserts vs. GCS staging (and why staging almost always wins on cost), bigquery_adapter() for partitioning and clustering, nested JSON normalization into parent-child tables, and the _dlt_ metadata tables dlt creates.
- dlt Incremental Loading — How dlt tracks state between runs using dlt.sources.incremental(). Cursor-based tracking, state stored in the destination, declarative REST API config, and how this relates to dbt incremental models downstream.
- dlt for AI-Assisted Pipeline Development — Why dlt’s Python-native, declarative design maps well to AI-assisted development. The REST API builder in practice, the AI + dlt workflow, and production results from teams who’ve made the switch.
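As a rough mental model for the three write dispositions and the cursor-based tracking named in the list above, the sketch below imitates them in plain Python. It is not dlt's API or implementation: `load`, `extract_incremental`, and the sample rows are hypothetical names made up for illustration, and the cursor comparison is simplified to a strict greater-than that ignores ties at the boundary. In dlt itself the disposition is selected with the `write_disposition` argument, and the linked notes cover the real behavior.

```python
from typing import Any, Optional

Row = dict[str, Any]


def load(table: list[Row], rows: list[Row], disposition: str,
         primary_key: str = "id") -> list[Row]:
    """Return the table contents after loading `rows` (hypothetical helper)."""
    if disposition == "replace":
        # Discard what was there; the table now holds only the new rows.
        return list(rows)
    if disposition == "append":
        # Keep existing rows and add the new ones, duplicates included.
        return table + list(rows)
    if disposition == "merge":
        # Upsert on the primary key: matching rows are overwritten,
        # unseen keys are added. Dicts keep insertion order.
        merged = {row[primary_key]: row for row in table}
        for row in rows:
            merged[row[primary_key]] = row
        return list(merged.values())
    raise ValueError(f"unknown write disposition: {disposition!r}")


def extract_incremental(source_rows: list[Row], cursor_field: str,
                        last_value: Optional[Any]) -> tuple[list[Row], Optional[Any]]:
    """Return rows past the stored cursor, plus the cursor to store next.

    Simplification (assumption): strict `>` comparison, no tie handling.
    """
    new_rows = [r for r in source_rows
                if last_value is None or r[cursor_field] > last_value]
    new_cursor = max((r[cursor_field] for r in new_rows), default=last_value)
    return new_rows, new_cursor


# Illustrative data only.
table = [{"id": 1, "status": "open", "updated_at": "2024-01-01"}]
batch = [
    {"id": 1, "status": "closed", "updated_at": "2024-01-03"},
    {"id": 2, "status": "open", "updated_at": "2024-01-02"},
]

print(len(load(table, batch, "replace")))  # only the new batch remains
print(len(load(table, batch, "append")))   # old row plus both new rows
print(load(table, batch, "merge"))         # id 1 updated in place, id 2 added

# First run: no stored cursor, so everything is extracted.
rows, cursor = extract_incremental(batch, "updated_at", None)
# Second run: nothing is newer than the stored cursor, so nothing is extracted.
rows_again, cursor_again = extract_incremental(batch, "updated_at", cursor)
```

Pairing merge with an incremental cursor is the combination that keeps reruns cheap and idempotent in this model: only rows past the cursor are extracted, and a row that changed since the last run overwrites its earlier version instead of duplicating it.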
Decision Context
The build-vs-buy decision framework covers when dlt is the right choice. dlt fits Python-proficient teams who want control, have budget constraints, or need sources without pre-built connectors. It is not suited to non-technical teams, organizations that need 700+ connectors, or teams without capacity to own pipeline infrastructure. See also Fivetran MAR Pricing Shift for the managed ELT pricing context.
Adjacent Reading
- Build vs. Buy Data Pipelines — The full economics argument for why the managed-vs-custom calculation shifted in 2025.
- BigQuery Cost Model — Understanding BigQuery’s cost model helps optimize the pipelines you build, especially around streaming vs. batch loading.
- Incremental Models in dbt — How incremental processing works in the transformation layer, complementing dlt’s extraction-layer incrementality.