Note

Automating dbt Docs Deployment

Patterns for keeping dbt docs automatically updated — CI/CD workflows, Astronomer Cosmos operators, and tools that push documentation to platforms like Notion

Planted
dbt · data engineering · automation

Running dbt docs generate by hand and uploading the output inevitably leads to stale documentation: someone forgets a step, and the published docs drift behind the project. Automating the deployment keeps the docs current as models change.

The core pattern

Regardless of your hosting platform, the automation pattern is the same:

  1. Trigger: merge to main (or a production dbt run completes)
  2. Generate: dbt docs generate (with --static if your host supports single-file serving)
  3. Deploy: push the output to your hosting platform

The trigger choice matters. Merging to main captures schema and documentation changes. Triggering after a production dbt run captures catalog changes (new columns, updated row counts). Ideally, you do both: generate on merge for fast documentation updates, and regenerate on a schedule to keep catalog metadata fresh.
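In GitHub Actions terms, the "do both" approach is a single workflow with two triggers. A minimal sketch (the cron time is an arbitrary example, not a recommendation):

```yaml
on:
  push:
    branches: [main]      # fast docs updates when schema or descriptions change
  schedule:
    - cron: "0 6 * * *"   # daily regeneration to refresh catalog metadata
```

Both triggers run the same generate-and-deploy job, so there is only one pipeline to maintain.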

CI/CD workflows

GitHub Actions

The simplest implementation for teams on GitHub:

name: Deploy dbt docs

on:
  push:
    branches: [main]

jobs:
  deploy-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dbt
        run: pip install dbt-bigquery  # or your adapter
      - name: Generate docs
        run: dbt docs generate --static
        env:
          DBT_PROFILES_DIR: .
      - name: Deploy to Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./target

For other hosting targets, replace the deploy step:

# Netlify
- name: Deploy to Netlify
  run: npx netlify-cli deploy --prod --dir=target
  env:
    NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
    NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}

# GCS
- name: Deploy to GCS
  run: gsutil -m rsync -r target/ gs://your-docs-bucket/
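An S3 variant follows the same shape. The bucket name is a placeholder, and the step assumes AWS credentials were configured earlier in the job (for example via aws-actions/configure-aws-credentials):

```yaml
# AWS S3
- name: Deploy to S3
  run: aws s3 sync target/ s3://your-docs-bucket/ --delete
```

The --delete flag removes files from the bucket that no longer exist in target/, so stale assets do not linger between deploys.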

GitLab CI

# GitLab Pages serves whatever the `pages` job exposes as a `public/` artifact,
# so generation and publishing can live in a single job.
pages:
  stage: deploy
  script:
    - pip install dbt-bigquery
    - dbt docs generate --static
    - mv target public
  artifacts:
    paths:
      - public
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

Astronomer Cosmos for Airflow

If your dbt project runs on Airflow, Astronomer Cosmos provides pre-built operators that generate and upload docs as a single Airflow task:

  • DbtDocsS3Operator — generates docs and uploads to S3
  • DbtDocsGCSOperator — generates docs and uploads to GCS
  • DbtDocsAzureStorageOperator — generates docs and uploads to Azure Blob Storage

These operators support the --static flag and handle the upload step for you. The typical pattern appends a docs generation task to the end of your dbt DAG:

from cosmos.operators import DbtDocsGCSOperator

generate_docs = DbtDocsGCSOperator(
    task_id="generate_and_upload_docs",
    project_dir="/path/to/dbt/project",
    profile_config=profile_config,  # your cosmos ProfileConfig
    connection_id="gcp_conn",
    bucket_name="your-docs-bucket",
    dbt_cmd_flags=["--static"],
)

# Chain after your dbt run tasks
dbt_run >> generate_docs

This ensures docs are regenerated every time your production dbt pipeline runs, keeping catalog metadata (row counts, column types) fresh.
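For teams not on Cosmos, the same task is straightforward to hand-roll: the operator essentially runs dbt docs generate and copies the output to the bucket. A sketch of that equivalent (function names, paths, and the bucket are illustrative assumptions; with --static, dbt writes a single static_index.html):

```python
import subprocess
from pathlib import Path


def build_docs(project_dir: str) -> Path:
    """Run `dbt docs generate --static` and return the static HTML output path."""
    subprocess.run(
        ["dbt", "docs", "generate", "--static"],
        cwd=project_dir,
        check=True,  # fail the task if dbt fails
    )
    return Path(project_dir) / "target" / "static_index.html"


def upload_cmd(local_file: Path, bucket: str) -> list[str]:
    """Build the gsutil command that publishes the docs file as index.html."""
    return ["gsutil", "cp", str(local_file), f"gs://{bucket}/index.html"]
```

Wrapped in a PythonOperator (or any scheduler task), this gives the same chain-after-the-run behavior without the extra dependency.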

Pushing docs to non-technical platforms

For teams whose primary docs audience is non-technical, the standard dbt docs site may not be the right destination. dbt-docs-to-notion exports model documentation to a Notion database:

  • Each model gets its own Notion page
  • Descriptions, columns, and tests are structured as Notion properties
  • The export can run on a schedule to keep Notion in sync

This does not replace the full docs site for engineers who need the DAG and lineage views. But it makes model metadata accessible to people who live in Notion — product managers, analysts, business stakeholders who would never open a technical documentation site.
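This is not dbt-docs-to-notion's actual implementation, but the extraction step any such tool performs can be sketched from dbt's manifest.json, which records every model's name, description, and column docs (the function name is mine):

```python
import json
from pathlib import Path


def extract_model_docs(manifest_path: str) -> list[dict]:
    """Pull per-model documentation out of a dbt manifest.json,
    shaped so each entry can map onto a Notion page's properties."""
    manifest = json.loads(Path(manifest_path).read_text())
    rows = []
    for node in manifest["nodes"].values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc.
        rows.append({
            "name": node["name"],
            "description": node.get("description", ""),
            "columns": {
                col: meta.get("description", "")
                for col, meta in node.get("columns", {}).items()
            },
        })
    return rows
```

From there, pushing each entry to a Notion database is a call to Notion's pages API per model.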

Scheduling considerations

How often should docs regenerate? The answer depends on what changes:

| Change type                          | Frequency             | What updates  |
| ------------------------------------ | --------------------- | ------------- |
| Schema changes (new models, columns) | On merge to main      | manifest.json |
| Description updates                  | On merge to main      | manifest.json |
| Catalog metadata (row counts, types) | After production runs | catalog.json  |
| Database statistics                  | After production runs | catalog.json  |

If your production dbt runs happen daily, regenerating docs daily captures both schema and catalog changes. If you merge frequently but run production less often, triggering on merge gives faster documentation updates while a scheduled regeneration keeps catalog metadata current.

For most teams, triggering on merge to main is sufficient. The catalog metadata (row counts, exact column types) is less critical than having accurate model descriptions and up-to-date lineage. If freshness of catalog data matters to your team, add a scheduled regeneration after your production run completes.

The freshness signal

One pattern worth adopting: include a freshness indicator in your overview page. A build timestamp in the __overview__ doc block tells visitors how current the docs are:

{% docs __overview__ %}
# Data platform documentation
Last generated: {{ run_started_at.strftime('%Y-%m-%d %H:%M UTC') }}
...
{% enddocs %}

This uses Jinja, which is resolved at dbt docs generate time. Visitors can immediately see whether the docs reflect today’s state or last month’s. If the timestamp is old, that is a signal that your automation pipeline needs attention.
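The same signal can be checked programmatically: dbt stamps metadata.generated_at into manifest.json at generation time, so a small script can alert when the deployed docs fall behind. A sketch (the two-day threshold is an arbitrary example):

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def docs_age(manifest_path: str) -> timedelta:
    """Return how old the generated docs are, based on the
    metadata.generated_at timestamp dbt writes into manifest.json."""
    meta = json.loads(Path(manifest_path).read_text())["metadata"]
    generated = datetime.fromisoformat(
        meta["generated_at"].replace("Z", "+00:00")  # normalize the UTC suffix
    )
    return datetime.now(timezone.utc) - generated


def is_stale(manifest_path: str, max_age: timedelta = timedelta(days=2)) -> bool:
    """True when the docs are older than the allowed window."""
    return docs_age(manifest_path) > max_age
```

Run against the deployed manifest on a schedule, this turns a quietly broken pipeline into an explicit alert instead of a stale timestamp someone may or may not notice.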