Note

Automating dbt Docs Deployment

Patterns for keeping dbt docs automatically updated — CI/CD workflows, Astronomer Cosmos operators, and tools that push documentation to platforms like Notion

Planted
dbt · data engineering · automation

Running dbt docs generate by hand and uploading the output inevitably leads to stale documentation: someone forgets a step, and the published docs drift behind the project. Automating the deployment keeps the docs current as models change.

The core pattern

Regardless of your hosting platform, the automation pattern is the same:

  1. Trigger: merge to main (or a production dbt run completes)
  2. Generate: dbt docs generate (with --static if your host supports single-file serving)
  3. Deploy: push the output to your hosting platform

The trigger choice matters. Merging to main captures schema and documentation changes. Triggering after a production dbt run captures catalog changes (new columns, updated row counts). Ideally, you do both: generate on merge for fast documentation updates, and regenerate on a schedule to keep catalog metadata fresh.
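In GitHub Actions terms, the "do both" approach is a single workflow with two triggers. A minimal sketch (the cron time is an arbitrary example, not a recommendation):

```yaml
on:
  push:
    branches: [main]      # fast docs updates when schema or descriptions change
  schedule:
    - cron: "0 6 * * *"   # daily regeneration to refresh catalog metadata
```

Both triggers run the same generate-and-deploy job, so there is only one pipeline to maintain.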

CI/CD workflows

GitHub Actions

The simplest implementation for teams on GitHub:

name: Deploy dbt docs

on:
  push:
    branches: [main]

jobs:
  deploy-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dbt
        run: pip install dbt-bigquery  # or your adapter
      - name: Generate docs
        run: dbt docs generate --static
        env:
          DBT_PROFILES_DIR: .
      - name: Deploy to Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./target

For other hosting targets, replace the deploy step:

# Netlify
- name: Deploy to Netlify
  run: npx netlify-cli deploy --prod --dir=target
  env:
    NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
    NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}

# GCS
- name: Deploy to GCS
  run: gsutil -m rsync -r target/ gs://your-docs-bucket/
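An S3 variant follows the same shape. The bucket name is a placeholder, and the step assumes AWS credentials were configured earlier in the job (for example via aws-actions/configure-aws-credentials):

```yaml
# AWS S3
- name: Deploy to S3
  run: aws s3 sync target/ s3://your-docs-bucket/ --delete
```

The --delete flag removes files from the bucket that no longer exist in target/, so stale assets do not linger between deploys.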

GitLab CI

# GitLab Pages serves whatever the `pages` job exposes as a `public/` artifact,
# so generation and publishing can live in a single job.
pages:
  stage: deploy
  script:
    - pip install dbt-bigquery
    - dbt docs generate --static
    - mv target public
  artifacts:
    paths:
      - public
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

Astronomer Cosmos for Airflow

If your dbt project runs on Airflow, Astronomer Cosmos provides pre-built operators that generate and upload docs as a single Airflow task:

  • DbtDocsS3Operator — generates docs and uploads to S3
  • DbtDocsGCSOperator — generates docs and uploads to GCS
  • DbtDocsAzureStorageOperator — generates docs and uploads to Azure Blob Storage

These operators support the --static flag and handle the upload step for you. The typical pattern appends a docs generation task to the end of your dbt DAG:

from cosmos.operators import DbtDocsGCSOperator

generate_docs = DbtDocsGCSOperator(
    task_id="generate_and_upload_docs",
    project_dir="/path/to/dbt/project",
    profile_config=profile_config,  # your cosmos ProfileConfig
    connection_id="gcp_conn",
    bucket_name="your-docs-bucket",
    dbt_cmd_flags=["--static"],
)

# Chain after your dbt run tasks
dbt_run >> generate_docs

This ensures docs are regenerated every time your production dbt pipeline runs, keeping catalog metadata (row counts, column types) fresh.
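For teams not on Cosmos, the same task is straightforward to hand-roll: the operator essentially runs dbt docs generate and copies the output to the bucket. A sketch of that equivalent (function names, paths, and the bucket are illustrative assumptions; with --static, dbt writes a single static_index.html):

```python
import subprocess
from pathlib import Path


def build_docs(project_dir: str) -> Path:
    """Run `dbt docs generate --static` and return the static HTML output path."""
    subprocess.run(
        ["dbt", "docs", "generate", "--static"],
        cwd=project_dir,
        check=True,  # fail the task if dbt fails
    )
    return Path(project_dir) / "target" / "static_index.html"


def upload_cmd(local_file: Path, bucket: str) -> list[str]:
    """Build the gsutil command that publishes the docs file as index.html."""
    return ["gsutil", "cp", str(local_file), f"gs://{bucket}/index.html"]
```

Wrapped in a PythonOperator (or any scheduler task), this gives the same chain-after-the-run behavior without the extra dependency.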

Pushing docs to non-technical platforms

For teams whose primary docs audience is non-technical, the standard dbt docs site may not be the right destination. dbt-docs-to-notion exports model documentation to a Notion database:

  • Each model gets its own Notion page
  • Descriptions, columns, and tests are structured as Notion properties
  • The export can run on a schedule to keep Notion in sync

This does not replace the full docs site for engineers who need the DAG and lineage views. But it makes model metadata accessible to people who live in Notion — product managers, analysts, business stakeholders who would never open a technical documentation site.
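This is not dbt-docs-to-notion's actual implementation, but the extraction step any such tool performs can be sketched from dbt's manifest.json, which records every model's name, description, and column docs (the function name is mine):

```python
import json
from pathlib import Path


def extract_model_docs(manifest_path: str) -> list[dict]:
    """Pull per-model documentation out of a dbt manifest.json,
    shaped so each entry can map onto a Notion page's properties."""
    manifest = json.loads(Path(manifest_path).read_text())
    rows = []
    for node in manifest["nodes"].values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc.
        rows.append({
            "name": node["name"],
            "description": node.get("description", ""),
            "columns": {
                col: meta.get("description", "")
                for col, meta in node.get("columns", {}).items()
            },
        })
    return rows
```

From there, pushing each entry to a Notion database is a call to Notion's pages API per model.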

Scheduling considerations

How often should docs regenerate? The answer depends on what changes:

| Change type                          | Frequency             | What updates  |
| ------------------------------------ | --------------------- | ------------- |
| Schema changes (new models, columns) | On merge to main      | manifest.json |
| Description updates                  | On merge to main      | manifest.json |
| Catalog metadata (row counts, types) | After production runs | catalog.json  |
| Database statistics                  | After production runs | catalog.json  |

If your production dbt runs happen daily, regenerating docs daily captures both schema and catalog changes. If you merge frequently but run production less often, triggering on merge gives faster documentation updates while a scheduled regeneration keeps catalog metadata current.

For most teams, triggering on merge to main is sufficient. The catalog metadata (row counts, exact column types) is less critical than having accurate model descriptions and up-to-date lineage. If freshness of catalog data matters to your team, add a scheduled regeneration after your production run completes.

The freshness signal

One pattern worth adopting: include a freshness indicator in your overview page. A build timestamp in the __overview__ doc block tells visitors how current the docs are:

{% docs __overview__ %}
# Data platform documentation
Last generated: {{ run_started_at.strftime('%Y-%m-%d %H:%M UTC') }}
...
{% enddocs %}

This uses Jinja, which is resolved at dbt docs generate time. Visitors can immediately see whether the docs reflect today’s state or last month’s. If the timestamp is old, that is a signal that your automation pipeline needs attention.
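The same signal can be checked programmatically: dbt stamps metadata.generated_at into manifest.json at generation time, so a small script can alert when the deployed docs fall behind. A sketch (the two-day threshold is an arbitrary example):

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def docs_age(manifest_path: str) -> timedelta:
    """Return how old the generated docs are, based on the
    metadata.generated_at timestamp dbt writes into manifest.json."""
    meta = json.loads(Path(manifest_path).read_text())["metadata"]
    generated = datetime.fromisoformat(
        meta["generated_at"].replace("Z", "+00:00")  # normalize the UTC suffix
    )
    return datetime.now(timezone.utc) - generated


def is_stale(manifest_path: str, max_age: timedelta = timedelta(days=2)) -> bool:
    """True when the docs are older than the allowed window."""
    return docs_age(manifest_path) > max_age
```

Run against the deployed manifest on a schedule, this turns a quietly broken pipeline into an explicit alert instead of a stale timestamp someone may or may not notice.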