
dlt Deployment Options

Where and how to run dlt pipelines in production — GitHub Actions, Airflow, Modal serverless, and other platforms — with the dlt deploy command as the starting point.

Tags: dlt, gcp, data engineering, etl, automation

dlt runs anywhere Python runs. A pipeline that works locally works in production with no code changes — only configuration changes. The dlt deploy command generates deployment scaffolding for common platforms so you don’t have to write that configuration from scratch.
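
As a rough illustration (my_api_source and the pipeline name are placeholders), the script itself stays environment-agnostic; only where dlt finds its configuration differs between a laptop and a deployment platform:

import dlt

from my_sources import my_api_source  # assumed: your dlt source module

# The script is identical locally and in production. What changes is where dlt
# resolves credentials: .dlt/secrets.toml on a laptop, environment variables
# such as DESTINATION__BIGQUERY__CREDENTIALS on the deployment platform.
pipeline = dlt.pipeline(pipeline_name="api_pipeline", destination="bigquery", dataset_name="api_data")
pipeline.run(my_api_source())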

The dlt deploy Command

dlt deploy takes your pipeline script and a target platform, and generates a deployment configuration:

dlt deploy my_pipeline.py github-action --schedule "0 6 * * *"
dlt deploy my_pipeline.py airflow-composer --secrets-format env

The generated files aren’t magic — they’re starting points you can modify. Review them before committing. They handle the boilerplate (dependency installation, secrets injection, scheduler configuration) so you can focus on any platform-specific adjustments your setup needs.

GitHub Actions

GitHub Actions is the simplest deployment path for teams already using GitHub. Pipelines run on a schedule, with credentials injected from repository secrets:

dlt deploy my_pipeline.py github-action --schedule "0 6 * * *"

This generates a workflow file in .github/workflows/. The generated workflow:

  • Checks out the repository
  • Installs Python and dlt dependencies
  • Configures secrets from GitHub repository secrets as environment variables
  • Runs the pipeline script on the specified cron schedule

Store credentials as GitHub repository secrets, not in the workflow file itself. See dlt Secrets Management for the environment variable naming conventions dlt expects.

GitHub Actions is a reasonable production choice for pipelines that run infrequently (hourly or less often) and don’t need complex dependency management between pipelines. It’s free for public repositories and consumes your account’s GitHub Actions minutes for private ones.

Airflow / Google Cloud Composer

For teams with existing Airflow infrastructure, dlt integrates through PipelineTasksGroup:

dlt deploy my_pipeline.py airflow-composer --secrets-format env

The generated DAG uses PipelineTasksGroup to create separate Airflow tasks per dlt resource. This gives you resource-level parallelism and retry control within Airflow’s task graph — each resource becomes a task that can fail and retry independently rather than the entire pipeline succeeding or failing as a unit.
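
A rough sketch of the PipelineTasksGroup pattern, modeled on dlt’s Airflow deployment docs; the schedule, names, and source module are placeholders, and the file generated by dlt deploy will differ in its details:

import dlt
from airflow.decorators import dag
from dlt.helpers.airflow_helper import PipelineTasksGroup
from pendulum import datetime

from my_sources import my_api_source  # assumed: your dlt source module

@dag(schedule="0 6 * * *", start_date=datetime(2024, 1, 1), catchup=False)
def load_api_data():
    # Task group that wraps the dlt pipeline run inside Airflow
    tasks = PipelineTasksGroup("api_pipeline", use_data_folder=False, wipe_local_data=True)
    pipeline = dlt.pipeline(
        pipeline_name="api_pipeline", destination="bigquery", dataset_name="api_data"
    )
    # decompose controls how resources map to Airflow tasks;
    # "serialize" creates one task per resource, and parallel modes also exist
    tasks.add_run(pipeline, my_api_source(), decompose="serialize", trigger_rule="all_done")

load_api_data()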

Google Cloud Composer is Google’s managed Airflow offering. If your organization runs other pipelines on Composer, adding dlt pipelines there keeps your orchestration centralized. The cost model is different from GitHub Actions — Composer is always-on infrastructure with hourly billing regardless of whether pipelines are running. See Cloud Composer Cost and Capabilities for when that cost is justified.

Modal

Modal is a good fit for serverless deployments where you don’t want to manage infrastructure:

import dlt
import modal

from my_sources import my_api_source  # assumed: your dlt source module

app = modal.App("my-dlt-pipeline")

# Assumes the Modal image for this function has dlt and the destination extras installed
@app.function(schedule=modal.Period(days=1))
def run_pipeline():
    pipeline = dlt.pipeline(destination="bigquery", dataset_name="api_data")
    pipeline.run(my_api_source())

Modal provisions compute on demand, runs the function, and tears it down. No persistent infrastructure. Billing is per-second of actual compute time, not per-hour of reserved capacity.

The trade-off: cold starts. Modal functions have startup latency when they haven’t run recently. For pipelines that run every hour, this is negligible. For pipelines that run every 5 minutes, cold start overhead may matter.

Modal’s secrets system integrates cleanly with dlt’s configuration hierarchy — store credentials in Modal secrets, access them as environment variables in the function.
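
A sketch of how that might look, where "dlt-bigquery" is a hypothetical Modal secret whose keys follow dlt’s environment variable naming:

import modal

app = modal.App("my-dlt-pipeline")

# The secret's keys (e.g. DESTINATION__BIGQUERY__CREDENTIALS) appear as environment
# variables inside the function, and dlt picks them up through its config resolution
@app.function(
    schedule=modal.Period(days=1),
    secrets=[modal.Secret.from_name("dlt-bigquery")],
)
def run_pipeline():
    ...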

Google Cloud Run Jobs

Cloud Run Jobs is the GCP-native serverless option. Unlike Modal (third-party), Cloud Run Jobs runs within your GCP project and integrates directly with IAM, Secret Manager, and Cloud Scheduler.

The deployment pattern is similar to Cloud Run Jobs for dbt — containerize the pipeline, push to Artifact Registry, create a Cloud Run Job, schedule with Cloud Scheduler. The difference is that your dlt pipeline script becomes the container entrypoint rather than a dbt command.
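
A minimal entrypoint script for that container might look like this (module and names are placeholders):

# pipeline.py, used as the container entrypoint
import dlt

from my_sources import my_api_source  # assumed: your dlt source module

if __name__ == "__main__":
    pipeline = dlt.pipeline(
        pipeline_name="api_pipeline", destination="bigquery", dataset_name="api_data"
    )
    load_info = pipeline.run(my_api_source())
    print(load_info)  # stdout ends up in Cloud Logging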

For GCP-native teams already using BigQuery as the destination, Cloud Run Jobs is often the most operationally natural choice: IAM controls access, Secret Manager holds credentials, Cloud Logging captures pipeline output, and Cloud Scheduler triggers runs — all within the same GCP project as your data.

Dagster

Dagster has native dlt integration through @dlt_assets. If your organization uses Dagster for orchestration, dlt pipelines become first-class Dagster assets:

import dlt
from dagster_dlt import DagsterDltResource, dlt_assets
from my_sources import my_api_source  # assumed: your dlt source module

pipeline = dlt.pipeline(destination="bigquery", dataset_name="api_data")
@dlt_assets(dlt_source=my_api_source(), dlt_pipeline=pipeline)
def my_api_assets(context, dlt: DagsterDltResource):
    yield from dlt.run(context=context)

This gives you Dagster’s asset graph, lineage tracking, and run history for your dlt pipelines alongside your dbt assets. For teams using Dagster for dbt orchestration, adding dlt assets keeps the full extraction-to-transformation pipeline visible in one place.
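
To make those assets materializable, they’re registered in a Dagster Definitions object together with the dlt resource; a minimal sketch:

from dagster import Definitions
from dagster_dlt import DagsterDltResource

# Register the dlt-backed assets and the resource they depend on
defs = Definitions(
    assets=[my_api_assets],
    resources={"dlt": DagsterDltResource()},
)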

Prefect

Prefect treats dlt pipelines as flows or tasks within flows. The integration is straightforward:

import dlt
from prefect import flow

from my_sources import my_api_source  # assumed: your dlt source module

@flow
def my_pipeline_flow():
    pipeline = dlt.pipeline(destination="bigquery", dataset_name="api_data")
    pipeline.run(my_api_source())

Schedule the flow with Prefect’s deployment and scheduler infrastructure.
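
One lightweight way to do that, assuming a recent Prefect version, is serving the flow with a cron schedule:

if __name__ == "__main__":
    # Starts a long-running process that executes the flow on the given cron schedule
    my_pipeline_flow.serve(name="daily-api-load", cron="0 6 * * *")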

Choosing a Platform

The decision usually comes down to what your team already uses and what operational complexity you can accept:

Platform            Best for
GitHub Actions      Simple schedules, teams already on GitHub
Cloud Run Jobs      GCP-native teams with BigQuery destination
Airflow/Composer    Existing Airflow infrastructure
Dagster             Teams orchestrating dbt + dlt together
Modal               Serverless-first teams without existing orchestration
Prefect             Existing Prefect workflows

All of these options run the same Python code. The deployment platform is a configuration concern, not a code concern.