dbt Documentation Rollout Strategy

A practical week-by-week approach to rolling out dbt documentation standards — starting with model descriptions, adding enforcement incrementally, and using AI tools to close coverage gaps

Tags: dbt, data quality, automation

This note lays out a week-by-week rollout for dbt documentation standards: gradual enforcement over two to three months, starting with model descriptions and adding layers as each becomes embedded. A big-bang sprint that documents an entire project at once tends to revert within months, because nothing enforces the standard once the sprint ends.

Week One: Model Descriptions Only

Write a one-page style guide covering model descriptions only. Don’t touch column descriptions yet. Define the four questions every description should answer (business purpose, grain, exclusions, source) and share two or three examples showing what the template looks like in practice:

# Template
- name: mrt__<domain>__<concept>
  description: >
    [What business concept this represents, in business terms].
    One row per [grain]. Excludes [any filters or business rules].
    Sources: [source systems and upstream models].

# Example
- name: mrt__marketing__customer_ltv
  description: >
    Customer lifetime value using a 3-year rolling window and
    discounted cash flow methodology (10% annual discount rate).
    One row per customer. Excludes customers with no completed orders.
    Sources: Shopify orders via base__shopify__orders.

Keep the style guide short enough to read in five minutes. Put it in CONTRIBUTING.md or dbt-styleguide.md in your project root. If you’re on dbt Cloud, use dbt-styleguide.md, which dbt Copilot reads automatically when generating documentation.

Don’t add any enforcement yet. This week is about establishing shared vocabulary and giving the team examples to work from.

Week Two: Pre-Commit Enforcement for Model Descriptions

Add check-model-has-description from dbt-checkpoint as a pre-commit hook:

.pre-commit-config.yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v2.0.6
    hooks:
      - id: check-model-has-description

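This means every commit that adds or modifies a model must include a model description. Developers who haven’t added one get an immediate failure before the commit goes through, not twenty minutes later in CI. dbt-checkpoint reads target/manifest.json, so the manifest has to be current when the hook runs; the project ships a dbt-parse hook that you can list first for this purpose.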

Simultaneously, run dbt-codegen to scaffold YAML stubs for existing models that don’t have descriptions:

dbt run-operation generate_model_yaml --args '{"model_names": ["model_name"], "upstream_descriptions": true}'

The upstream_descriptions: true flag copies column descriptions from upstream models and sources wherever a column name matches, so you only need to write descriptions that are genuinely new. For models that inherit all their columns from a single upstream source, this can auto-populate much of the YAML. Fill in the gaps over the following two or three sprints rather than trying to do it all at once.
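To decide which models to pass to generate_model_yaml, one option is to read the manifest and list the models whose description is empty. The following is a minimal sketch, assuming the usual manifest layout (a top-level "nodes" mapping whose entries carry "resource_type", "name", and "description"). The function names undocumented_models and codegen_args are hypothetical helpers, not part of dbt or dbt-codegen.

```python
import json


def undocumented_models(manifest: dict) -> list[str]:
    """Return the names of models whose description is missing or blank.

    Assumes `manifest` is the parsed content of target/manifest.json.
    """
    names = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots
        if not (node.get("description") or "").strip():
            names.append(node["name"])
    return sorted(names)


def codegen_args(model_names: list[str]) -> str:
    """Build the JSON string passed to --args for generate_model_yaml."""
    return json.dumps({"model_names": model_names, "upstream_descriptions": True})
```

In practice you would load the manifest with json.load on target/manifest.json, call undocumented_models on it, and paste the output of codegen_args into the --args value of the command above, perhaps one directory at a time.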

Month Two: Column Descriptions for Mart Models

Extend the requirements to column descriptions, but only for mart models. Not for base or intermediate models yet — that comes later if you want it at all.

Update the pre-commit config to add graduated enforcement with the files: regex:

    hooks:
      - id: check-model-has-description
      - id: check-model-has-all-columns
        files: ^models/marts
      - id: check-model-columns-have-desc
        files: ^models/marts
      - id: check-column-desc-are-same

The files: regex scopes check-model-has-all-columns and check-model-columns-have-desc to the models/marts directory only. This means base and intermediate models can still be committed without column descriptions, while mart models, the ones business users actually query, must have every column listed and described.
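pre-commit treats files: as a Python regular expression and searches each staged path, relative to the repository root, for a match. The sketch below reproduces that matching so you can try a pattern before committing it. The helper name hook_applies and the sample paths are made up for illustration.

```python
import re

# Same pattern as the files: entries in the config above
MARTS_ONLY = re.compile(r"^models/marts")


def hook_applies(path: str, pattern: re.Pattern = MARTS_ONLY) -> bool:
    """True if a staged path would be handed to a hook scoped by `pattern`."""
    return pattern.search(path) is not None


# Hypothetical staged files, relative to the repository root
staged = [
    "models/marts/marketing/mrt__marketing__customer_ltv.sql",
    "models/base/shopify/base__shopify__orders.sql",
    "analyses/models/marts/scratch.sql",
]
checked = [p for p in staged if hook_applies(p)]
```

Only the first path ends up in checked. Two things follow from the pattern as written: it also matches a sibling directory such as models/marts_archive (use ^models/marts/ to exclude it), and if the dbt project lives in a subdirectory of the repository the pattern needs that prefix.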

The check-column-desc-are-same hook catches a subtle inconsistency: the same column described differently in two models. It carries no files: filter because it is worth having project-wide, not just for marts. It prevents the situation where customer_id has five slightly different descriptions across your project.
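To see how much cleanup the consistency check will demand before you turn it on, you can scan the manifest for column names that carry more than one description. This is a rough sketch of the idea, not dbt-checkpoint's implementation. It assumes each model node has a "columns" mapping whose entries carry "name" and "description"; the function name is hypothetical.

```python
from collections import defaultdict


def conflicting_column_descriptions(manifest: dict) -> dict[str, set[str]]:
    """Map column name -> distinct non-empty descriptions, keeping only
    columns that are described in more than one way across models."""
    seen: dict[str, set[str]] = defaultdict(set)
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        for column in node.get("columns", {}).values():
            description = (column.get("description") or "").strip()
            if description:  # undocumented columns are a separate problem
                seen[column["name"]].add(description)
    return {name: descs for name, descs in seen.items() if len(descs) > 1}
```

Each key in the result is a column that needs one canonical description; resolving those first keeps the hook from blocking unrelated commits on the day it is enabled.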

Update your style guide to include the column description patterns: specify units for numeric columns, include timezone for timestamps, state key relationships for ID columns, enumerate valid values for status fields.

Ongoing: AI-Assisted Coverage and Tracking

Once enforcement is working for mart model descriptions and columns, the remaining coverage gaps are best closed with AI assistance rather than manual effort. Use dbt Copilot, Claude Code with the dbt MCP server, or Altimate AI to generate first drafts, then refine with human knowledge for models where business context matters most.

Track coverage with dbt-coverage or dbt-project-evaluator and surface the numbers in PR comments:

# dbt-coverage reads manifest.json and catalog.json from target/ by default,
# so generate both artifacts first
dbt docs generate
dbt-coverage compute doc

Coverage numbers in PR comments make documentation state visible during review. The trend matters more than the absolute number — coverage moving from 65% to 70% to 75% indicates a working process; coverage stuck at 65% for three months indicates a process gap.
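If you want a number before wiring up a dedicated tool, column coverage can be approximated from the manifest alone, and the trend reading described above reduces to comparing the first and last snapshot. The sketch below assumes the standard manifest fields "columns" and "original_file_path"; doc_coverage and trend are hypothetical names. It only counts columns declared in YAML, so it will report higher coverage than a tool that also compares against the warehouse catalog.

```python
def doc_coverage(manifest: dict, path_prefix: str = "") -> float:
    """Fraction of declared model columns that have a non-empty description.

    `path_prefix` limits the count to one directory, e.g. "models/marts".
    """
    total = documented = 0
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        if not node.get("original_file_path", "").startswith(path_prefix):
            continue
        for column in node.get("columns", {}).values():
            total += 1
            if (column.get("description") or "").strip():
                documented += 1
    return documented / total if total else 0.0


def trend(history: list[float], tolerance: float = 0.005) -> str:
    """Classify a chronological series of coverage values."""
    if len(history) < 2:
        return "unknown"
    change = history[-1] - history[0]
    if change > tolerance:
        return "improving"
    if change < -tolerance:
        return "regressing"
    return "stalled"
```

With the figures from the paragraph above, trend([0.65, 0.70, 0.75]) classifies as improving and trend([0.65, 0.65, 0.65]) as stalled. The tolerance is an arbitrary choice that stops a single newly described column from counting as progress.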

What Not to Do

A few rollout antipatterns worth avoiding:

Don’t require 100% from day one. Starting at a threshold you can’t currently meet creates either a compliance crisis or a standard everyone ignores. Start at whatever your current coverage actually is (even 30%), get enforcement working, then ratchet up.

Don’t enforce column descriptions everywhere. Base model columns that map directly to source fields don’t need descriptions in most cases — the column name is the description. Reserve the effort for mart models where business users browse the docs.

Don’t let enforcement slip for “just this once.” The --no-verify flag on git commits is a trap. Every exception normalizes skipping documentation. The discipline compounds in both directions: enforce it consistently and it becomes habit; skip it once and it becomes optional.

Don’t treat the style guide as permanent. Your standards will evolve as you learn what’s useful and what’s overhead. Update the style guide when the team discovers a better pattern. The important thing is that the style guide reflects current practice, not aspirational practice that nobody actually follows.
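The first point, starting at current coverage and ratcheting up, comes down to a small rule: a change passes if coverage has not dropped below the stored baseline, and the baseline only ever moves upward. Here is a minimal sketch of that rule. The function name ratchet, the slack parameter, and the idea of storing the baseline in a file in the repository are assumptions for illustration; dbt-coverage's README describes its own threshold and comparison options that cover similar ground.

```python
def ratchet(current: float, baseline: float, slack: float = 0.0) -> tuple[bool, float]:
    """Return (passed, new_baseline).

    The check passes when `current` coverage is at least `baseline - slack`.
    On a pass the baseline rises to the higher of the two values; on a
    failure it stays where it was.
    """
    passed = current >= baseline - slack
    new_baseline = max(baseline, current) if passed else baseline
    return passed, new_baseline


# Starting at 30% coverage: equal coverage passes, improvement raises the bar,
# and a later drop below the new bar fails.
first = ratchet(current=0.30, baseline=0.30)
second = ratchet(current=0.35, baseline=first[1])
third = ratchet(current=0.32, baseline=second[1])
```

A team would typically commit the new baseline back to the repository after each merge to the main branch, so that the threshold tracks real coverage instead of an aspirational target.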

Realistic Timeline

| When | What |
| --- | --- |
| Week 1 | Write style guide, share examples, no enforcement |
| Week 2 | Pre-commit hook for model descriptions; scaffold existing models with dbt-codegen |
| Sprint 2-4 | Fill in model descriptions on existing models, one directory at a time |
| Month 2 | Extend enforcement to column descriptions for mart models |
| Month 3+ | AI-assisted first drafts for remaining gaps; track coverage trends |

The full process takes about two to three months to feel embedded. After that, CI catches violations, coverage trends stay visible, and the style guide functions as a project convention alongside naming and testing standards.