Dagster+ offers two deployment modes on GCP. The choice depends on whether you want Dagster to manage compute entirely, or whether you need execution to stay within your own infrastructure.
Serverless Mode
Dagster hosts everything. Your code runs on Dagster’s infrastructure. You push code, Dagster handles deployment, scaling, and execution.
Best for: Workloads that orchestrate external services rather than running heavy compute. If your Dagster pipeline primarily tells BigQuery to run queries (via dbt), triggers Fivetran syncs, and writes metadata, the actual compute is minimal — BigQuery and Fivetran do the heavy lifting. Dagster Serverless handles the orchestration layer cheaply.
Limitations:
- Limited to 4 CPUs per node. For most dbt + BigQuery workflows this is fine, since the compute-intensive work happens in BigQuery, not in Dagster.
- Code and data transit through Dagster’s infrastructure. If your security requirements mandate that all execution stays within your VPC, Serverless doesn’t fit.
- Serverless compute time ($0.005/minute) is billed on top of the credit-based pricing.
Setup: Minimal. Connect your Git repository, configure resources pointing at your GCP project, and Dagster handles the rest. No GKE cluster, no Helm chart, no infrastructure to manage.
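As a minimal sketch, a Serverless code location is declared in a `dagster_cloud.yaml` at the repository root; the location and package names below are placeholders:

```yaml
# dagster_cloud.yaml — minimal Serverless code location (names are placeholders)
locations:
  - location_name: analytics
    code_source:
      package_name: analytics
```

On each push, Dagster+ builds and deploys this location; resources pointing at your GCP project are configured in the package's Python definitions.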
For analytics engineering teams on dbt + BigQuery where the pipeline is primarily orchestration (scheduling dbt builds, coordinating Fivetran syncs, triggering downstream refreshes), Serverless is the simpler path to production deployment.
Hybrid Mode
Dagster hosts the control plane. Execution runs in your infrastructure. The Dagster+ control plane manages the web UI, scheduling, sensor evaluation, and run coordination. The actual computation — running dbt, executing Python assets, calling external APIs — happens on a Kubernetes agent in your GCP project.
Best for: Teams with security requirements (data stays in your VPC), heavy compute needs (more than 4 CPUs), or existing GKE infrastructure they want to reuse.
GKE Deployment
The standard Hybrid deployment runs a Dagster agent on Google Kubernetes Engine using Dagster’s official Helm chart:
```bash
helm repo add dagster https://dagster-io.github.io/helm
helm install dagster dagster/dagster \
  --set dagsterCloud.deployment=prod \
  --set dagsterCloud.agentToken=$DAGSTER_AGENT_TOKEN
```

The Helm chart deploys:
- Agent pod that polls Dagster+ for runs to execute
- Worker pods that spin up for each run, execute your code, and shut down
- Configuration for resource limits, service accounts, and secrets
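As an illustration of the third point, resource requests and limits for the pods can be set through the chart's values. The keys below are illustrative, not authoritative — verify the exact names against the chart's values.yaml before using them:

```yaml
# Illustrative Helm values — confirm key names against the chart's values.yaml
dagsterCloud:
  deployment: prod
workspace:
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 4Gi
```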
Authentication: Workload Identity
The recommended authentication pattern for GCP uses Workload Identity, which maps a Kubernetes service account to a GCP service account. Your Dagster agent pods authenticate to BigQuery, GCS, and other GCP services without service account keys.
```yaml
# Helm values for Workload Identity
serviceAccount:
  create: true
  annotations:
    iam.gke.io/gcp-service-account: dagster-agent@my-project.iam.gserviceaccount.com
```

On the GCP side, bind the GCP service account to the Kubernetes service account:
```bash
gcloud iam service-accounts add-iam-policy-binding \
  dagster-agent@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[dagster/dagster-agent]"
```

This follows the same ADC resolution pattern used in Cloud Run Jobs. The service account needs:
- `roles/bigquery.dataEditor` and `roles/bigquery.jobUser` for dbt execution
- `roles/storage.objectViewer` for reading from GCS (if your pipeline uses GCS staging)
- Any additional roles required by your Python assets
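Assuming the service account name used above, the project-level roles can be granted with gcloud (the project ID is a placeholder):

```bash
# Grant the BigQuery and GCS roles to the agent's service account
PROJECT=my-project
SA=dagster-agent@${PROJECT}.iam.gserviceaccount.com

for ROLE in roles/bigquery.dataEditor roles/bigquery.jobUser roles/storage.objectViewer; do
  gcloud projects add-iam-policy-binding "$PROJECT" \
    --member "serviceAccount:${SA}" \
    --role "$ROLE"
done
```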
Storage: Cloud SQL + GCS
Dagster needs persistent storage for run history, event logs, and asset metadata. In Hybrid mode, you provide this:
- Cloud SQL PostgreSQL for the run storage and event log storage. A small Cloud SQL instance (db-f1-micro or db-g1-small) handles most workloads at $10-30/month.
- GCS bucket for I/O manager persistence — when assets pass data between steps, the serialized data lives in GCS.
```yaml
# Helm values for Cloud SQL
postgresql:
  enabled: false  # Don't deploy PostgreSQL in the cluster

dagsterDaemon:
  env:
    DAGSTER_PG_HOST: /cloudsql/my-project:us-central1:dagster-db
    DAGSTER_PG_DB: dagster
    DAGSTER_PG_USER: dagster
    DAGSTER_PG_PASSWORD:
      secretKeyRef:
        name: dagster-db-credentials
        key: password
```

For Cloud SQL connectivity, use the Cloud SQL Auth Proxy as a sidecar container in the Dagster agent pod. This handles encrypted connections without exposing the database to the public internet.
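A sketch of that sidecar, assuming the v2 Cloud SQL Auth Proxy image (the image tag and instance connection name are placeholders):

```yaml
# Cloud SQL Auth Proxy as a sidecar next to the Dagster agent container
containers:
  - name: cloud-sql-proxy
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.14
    args:
      - "--port=5432"
      - "my-project:us-central1:dagster-db"
    securityContext:
      runAsNonRoot: true
```

The proxy listens on localhost:5432 inside the pod, which is why the Helm values above can point `DAGSTER_PG_HOST` at the instance without any public IP on the database.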
Cloud Run Option (Community)
A community-maintained dagster-contrib-gcp package supports executing Dagster runs as Cloud Run jobs instead of GKE pods. This appeals to teams that prefer serverless compute and want to avoid managing a Kubernetes cluster.
The trade-offs versus GKE:
- Simpler infrastructure. No GKE cluster to manage.
- Cold start latency. Cloud Run jobs take seconds to spin up, which adds to execution time.
- Less control. Kubernetes offers fine-grained resource configuration, scheduling, and pod affinity that Cloud Run doesn’t support.
- Community-maintained. Not an official Dagster integration, so support and maintenance depend on community contributors.
For small teams running dbt + BigQuery workflows where the orchestration layer is lightweight, Cloud Run execution is a reasonable choice. For teams with heavy compute requirements or complex infrastructure needs, GKE is more robust.
Choosing a Mode
| Factor | Serverless | Hybrid (GKE) | Hybrid (Cloud Run) |
|---|---|---|---|
| Setup complexity | Minimal | High | Medium |
| Infrastructure management | None | GKE + Cloud SQL | Cloud SQL |
| Data residency | Dagster’s infra | Your GCP project | Your GCP project |
| Max compute per node | 4 CPUs | Configurable | Cloud Run limits |
| Monthly infra cost | Included in pricing | $50-200+ (GKE + Cloud SQL) | $10-50 (Cloud SQL) |
| Best for | Orchestration-light workloads | Enterprise, security-sensitive | Small teams, simple pipelines |
For GCP-native analytics engineering teams, the decision comes down to security requirements and infrastructure. If data can transit through Dagster’s infrastructure and compute needs are modest, Serverless is the simpler option. If data must stay in a VPC or GKE is already in use for other workloads, Hybrid on GKE is a natural fit.
The pricing note covers the cost implications of each mode, and the GCP orchestration framework positions Dagster relative to GCP-native alternatives.