Deploying dbt via Cloud Functions requires a specific repository layout. Cloud Functions expects its entry point file (main.py) and requirements.txt at the root of the source directory. That conflicts with the typical dbt project layout, where dbt_project.yml and all the dbt folders live at the root.
The solution: nest the dbt project inside a subdirectory, and put the Cloud Function files at the repository root.
The Target Structure
```
repo-root/
├── dbt_transform/
│   ├── analyses/
│   ├── models/
│   ├── seeds/
│   ├── snapshots/
│   ├── tests/
│   ├── dbt_project.yml
│   ├── packages.yml
│   └── profiles.yml
├── main.py
└── requirements.txt
```

The name dbt_transform is arbitrary — call it whatever makes sense for your project. The important part is that all dbt files live inside it, and main.py sits at the root alongside requirements.txt.
If you’re starting from an existing dbt project (everything at the root), you’re doing a simple mkdir dbt_transform && mv of all the dbt files into it. Nothing inside the dbt project changes; only the directory it lives in changes.
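The restructure is a one-time move. A minimal sketch (the dbt files are simulated here with empty placeholders so the commands are self-contained; in a real project they already exist, so only the last two commands matter):

```shell
# Simulate an existing dbt project at the repo root:
mkdir -p analyses models seeds snapshots tests
touch dbt_project.yml packages.yml profiles.yml

# The actual restructure — nest everything under dbt_transform/:
mkdir dbt_transform
mv analyses models seeds snapshots tests \
   dbt_project.yml packages.yml profiles.yml dbt_transform/
```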
The profiles.yml
The profiles.yml lives inside dbt_transform/, not in ~/.dbt/ as you might have it locally. This is intentional — Cloud Functions doesn’t have access to your home directory. The main.py explicitly sets DBT_PROFILES_DIR to point at the subdirectory.
Use method: oauth for authentication. This tells dbt-bigquery to use Application Default Credentials, which resolves to the Cloud Function’s attached service account at runtime:
```yaml
dbt_project_name:
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: gcp_project_name
      dataset: dbt
      location: EU
      threads: 4
      job_execution_timeout_seconds: 3500
      job_retries: 1
      priority: interactive
  target: dev
```

Don't use method: service-account with a key file here — that would require a key file on disk, which you don't want in a Cloud Function. The oauth method plus an attached service account is cleaner: no secrets in the repository, no key rotation, and it works the same way in both Cloud Functions and Cloud Run Jobs.
Also note that the target is dev here. This is fine for a Cloud Function that runs in production — the target name is just a label. What matters is that project and dataset point at the correct production resources. If you want a more explicit prod target, add it and change the target: field to prod.
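A sketch of what an explicit prod target could look like, keeping dev alongside it for local runs (the dataset names here are illustrative, not from the original config):

```yaml
dbt_project_name:
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: gcp_project_name
      dataset: dbt_dev
      location: EU
    prod:
      type: bigquery
      method: oauth
      project: gcp_project_name
      dataset: dbt
      location: EU
  target: prod
```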
main.py
The entry point does three things: sets the DBT_PROFILES_DIR environment variable, changes to the dbt project directory, and invokes dbt as a subprocess:
```python
import os
import subprocess
import logging

logging.basicConfig(level=logging.INFO)


def run_dbt(request):
    try:
        os.environ['DBT_PROFILES_DIR'] = '/workspace/dbt_transform'
        dbt_project_dir = '/workspace/dbt_transform'
        os.chdir(dbt_project_dir)

        logging.info(f"Current working directory: {os.getcwd()}")
        logging.info(f"Files in the current directory: {os.listdir('.')}")

        # Install dbt packages
        logging.info("Installing dbt packages...")
        subprocess.run(['dbt', 'deps'], check=True, capture_output=True, text=True)

        # Run dbt; check=True raises CalledProcessError on a non-zero exit code,
        # so a failed build lands in the handler below instead of returning
        # its stdout as if it succeeded
        result = subprocess.run(
            ['dbt', 'build'],
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout

    except subprocess.CalledProcessError as e:
        logging.error(f"Command '{e.cmd}' returned non-zero exit status {e.returncode}.")
        logging.error(f"stdout: {e.stdout}")
        logging.error(f"stderr: {e.stderr}")
        return f"Error running dbt: {e.stderr}"
    except Exception as e:
        logging.error(f"Error running dbt: {str(e)}")
        return f"Error running dbt: {str(e)}"
```

A few things worth noting:
/workspace/dbt_transform is the path where Cloud Functions mounts your source code. The /workspace/ prefix is a Cloud Functions 2nd gen convention. Don’t hardcode a path that works on your local machine here.
dbt deps runs at execution time, not at deploy time. Every invocation of the function reinstalls your dbt packages. This adds latency proportional to how many packages you have. If you’re using many packages and run time matters, Cloud Run Jobs with packages baked into the container image is worth considering.
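A middle ground, short of moving to containers, is to skip the install when the packages are already present. On a warm Cloud Functions instance the filesystem survives between requests, so repeat invocations can reuse the previous install; a cold start still pays the full cost. A sketch (`ensure_dbt_packages` is a hypothetical helper, not part of dbt; `dbt_packages/` is dbt's default install path in v1.x):

```python
import os
import subprocess


def ensure_dbt_packages(project_dir: str) -> bool:
    """Run `dbt deps` only if packages aren't already installed.

    Returns True when deps were actually installed on this call,
    False when an existing dbt_packages/ directory was reused.
    """
    packages_dir = os.path.join(project_dir, 'dbt_packages')
    if os.path.isdir(packages_dir):
        return False
    subprocess.run(['dbt', 'deps'], check=True, cwd=project_dir,
                   capture_output=True, text=True)
    return True
```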
dbt build runs your models, tests, seeds, and snapshots in dependency order. If you only want models, use dbt run. If you want to add selectors (e.g., only run specific tags or paths), extend the list: ['dbt', 'build', '--select', 'tag:daily'].
Error handling separates subprocess.CalledProcessError (dbt returned a non-zero exit code) from general exceptions. This makes log triage easier — you know whether dbt itself failed or whether something went wrong in the Python wrapper.
requirements.txt
Keep it minimal:
```
dbt-core
dbt-bigquery
```

For production use, pin to specific versions. A deployment that works today shouldn't break because dbt-core released a new version with a breaking change overnight:

```
dbt-core==1.9.1
dbt-bigquery==1.9.0
```

Check dbt-bigquery releases to verify compatibility between dbt-core and dbt-bigquery versions. Mismatched versions are a common source of silent failures.
Why This Layout Works
The subdirectory pattern is a trade-off. It adds a nesting level that doesn’t exist in a pure dbt project, which means any dbt commands you run locally need to be run from inside dbt_transform/, or you need to pass --project-dir dbt_transform explicitly.
That’s a small annoyance worth accepting because the alternative — moving Cloud Function files into the dbt project root — clutters the dbt namespace and can confuse tools that expect standard dbt project structure.
If you later migrate to Cloud Run Jobs, the dbt_transform/ subdirectory stays. The Cloud Run Job just changes how and when it’s invoked — the dbt project itself is unchanged. This portability is worth designing for from the start.