Cloud Functions is a serverless compute option on GCP that runs code in response to an HTTP request or a Pub/Sub message. For dbt, it’s an entry point into scheduled orchestration that’s simpler to set up than Cloud Run Jobs — no Docker, no container registry, no artifact pipeline. You point it at a Python file and a requirements.txt, and GCP handles the rest.
The tradeoff is less control, less efficiency, and harder local reproduction. For teams already on GCP who want a scheduled dbt run without containerization, Cloud Functions is a viable starting point.
How It Works
A Cloud Function for dbt is a Python function that uses subprocess to invoke the dbt CLI:
```python
import os
import subprocess
import logging

logging.basicConfig(level=logging.INFO)


def run_dbt(request):
    try:
        # Point dbt at the project bundled with the function source
        os.environ['DBT_PROFILES_DIR'] = '/workspace/dbt_transform'
        dbt_project_dir = '/workspace/dbt_transform'
        os.chdir(dbt_project_dir)

        logging.info(f"Current working directory: {os.getcwd()}")
        logging.info(f"Files in the current directory: {os.listdir('.')}")

        # Install dbt packages
        subprocess.run(['dbt', 'deps'], check=True, capture_output=True, text=True)

        # Run dbt; check=True raises CalledProcessError if the build fails
        result = subprocess.run(
            ['dbt', 'build'],
            check=True,
            capture_output=True,
            text=True
        )
        return result.stdout
    except subprocess.CalledProcessError as e:
        logging.error(f"Command '{e.cmd}' returned non-zero exit status {e.returncode}.")
        logging.error(f"stdout: {e.stdout}")
        logging.error(f"stderr: {e.stderr}")
        return f"Error running dbt: {e.stderr}"
    except Exception as e:
        logging.error(f"Error running dbt: {str(e)}")
        return f"Error running dbt: {str(e)}"
```

The function deploys with a requirements.txt that lists dbt-core and dbt-bigquery as dependencies. GCP installs these at deploy time. At runtime, dbt gets invoked as a subprocess within the function’s execution environment.
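A minimal requirements.txt might look like the following. The version ranges are illustrative, not a recommendation; pin to whatever your project has actually tested:

```text
# Illustrative ranges; pin to the versions your project has tested
dbt-core~=1.7.0
dbt-bigquery~=1.7.0
```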
The deploy command looks like this:
```bash
gcloud functions deploy dbt_run \
  --region=europe-west1 \
  --service-account=dbt-transform-sa@projectid.iam.gserviceaccount.com \
  --gen2 \
  --runtime=python310 \
  --entry-point=run_dbt \
  --trigger-http \
  --timeout=3500 \
  --memory=1G
```

The --timeout=3500 flag sets the timeout just under the 60-minute maximum for 2nd gen HTTP-triggered functions. If your dbt project takes longer than that to run, Cloud Functions isn’t the right fit.
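Once the function is deployed, a Cloud Scheduler job can hit the HTTP trigger on a cron schedule. A minimal sketch, assuming the same region and service account as above; the job name, schedule, and function URL are placeholders, and the service account needs permission to invoke the function:

```bash
gcloud scheduler jobs create http dbt-nightly-run \
  --location=europe-west1 \
  --schedule="0 2 * * *" \
  --uri="https://europe-west1-projectid.cloudfunctions.net/dbt_run" \
  --http-method=POST \
  --oidc-service-account-email=dbt-transform-sa@projectid.iam.gserviceaccount.com
```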
Cloud Functions vs. Cloud Run Jobs
The comparison matters because Cloud Run Jobs is the more commonly recommended approach. Both are serverless, both bill only for execution time, and both integrate natively with Cloud Scheduler and Eventarc. The meaningful differences are operational:
Setup complexity. Cloud Functions wins here. No Docker, no container registry, no image builds. Upload your Python file and requirements.txt, and GCP handles the runtime. Cloud Run Jobs requires building and pushing a container image before you can deploy — more steps, more tooling, more to debug.
Dependency installation timing. With Cloud Functions, dbt-core and dbt-bigquery install at deploy time but dbt deps (installing dbt packages) runs at execution time. This means every cold start installs your dbt packages before running models. On projects with many packages, this adds minutes to every run. Cloud Run Jobs bakes packages into the image at build time — execution starts immediately.
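For contrast, a Cloud Run Jobs setup typically runs dbt deps while the image is built, so packages ship inside the artifact. A rough Dockerfile sketch, with assumed paths and no version pinning, just to show where that step lands:

```dockerfile
FROM python:3.10-slim
RUN pip install dbt-core dbt-bigquery
WORKDIR /app
COPY dbt_transform/ .
# Packages resolved at build time, not on every cold start
RUN dbt deps
ENTRYPOINT ["dbt", "build"]
```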
Reproducibility. requirements.txt pinned to specific versions gets you most of the way there, but the Cloud Function runtime environment is managed by GCP. A Cloud Run image you built last Tuesday is identical to the image you deploy today. Cloud Function builds can differ subtly from one deployment to the next.
Timeout ceiling. Cloud Functions 2nd gen: up to 60 minutes. Cloud Run Jobs: up to 168 hours. For dbt projects with long-running models, the timeout ceiling on Cloud Functions is a real constraint.
Local testing. You can invoke a Cloud Run Job locally with docker run. There’s no equivalent for Cloud Functions without additional tooling (the Functions Framework). Cloud Run Jobs wins for local development workflows.
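That said, the Functions Framework does give you a rough local approximation. Assuming the function above lives in main.py and you have Application Default Credentials configured locally, something like this starts a local HTTP server you can curl; the hard-coded /workspace path in the function would also need to point at your local project:

```bash
pip install functions-framework
functions-framework --target=run_dbt --port=8080
# In a second terminal:
curl -X POST http://localhost:8080/
```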
For most teams, Cloud Run Jobs is the better long-term foundation. But if you need dbt running on a schedule this week and you don’t have container infrastructure set up, Cloud Functions gets you there faster.
Authentication
Both options use the same underlying mechanism: an attached service account, OAuth method in profiles.yml, and Application Default Credentials at runtime.
In profiles.yml:
```yaml
dbt_project_name:
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: gcp_project_name
      dataset: dbt
      location: EU
      threads: 4
      job_execution_timeout_seconds: 3500
      job_retries: 1
      priority: interactive
  target: dev
```

The method: oauth tells dbt-bigquery to use ADC for credentials. When the function runs, the attached service account’s identity is automatically available through GCP’s metadata service — no key files, no GOOGLE_APPLICATION_CREDENTIALS variable needed in production. The service account needs the right BigQuery permissions on your source and transform projects.
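Granting those permissions comes down to IAM bindings on the source and transform projects. A sketch with placeholder project IDs; your exact roles may differ, and dataset-level grants are often preferable to project-level ones:

```bash
# Read access on the source project
gcloud projects add-iam-policy-binding source-project-id \
  --member="serviceAccount:dbt-transform-sa@projectid.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"

# Write access plus query execution on the transform project
gcloud projects add-iam-policy-binding transform-project-id \
  --member="serviceAccount:dbt-transform-sa@projectid.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding transform-project-id \
  --member="serviceAccount:dbt-transform-sa@projectid.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
```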
When Cloud Functions Makes Sense
Cloud Functions is the right choice when:
- You need scheduling fast, without learning container workflows or setting up Artifact Registry.
- Your dbt project is small — fewer than 30-40 models, total runtime well under 30 minutes. The timeout ceiling and cold-start overhead become irrelevant.
- Your team is already familiar with Cloud Functions from other use cases. Using a familiar tool reduces operational surface area.
- This is temporary infrastructure while you evaluate more robust options. Cloud Functions is easy to spin up and easy to replace.
It’s the wrong choice when:
- Your dbt project takes more than 45 minutes to run (approaching the timeout ceiling).
- You need exact package versions pinned to an immutable artifact.
- You want to test your production environment locally before deploying.
- You’re already using Cloud Run for other workloads (marginal cost of adding Jobs is near zero).
For a new project with time to set it up, Cloud Run Jobs is the better long-term foundation. Cloud Functions is a reasonable first step when scheduling is needed immediately.