Note

dbt Documentation CI Enforcement

Tools and patterns for enforcing dbt documentation completeness in CI — dbt-coverage, dbt-checkpoint, dbt-score, and dbt-bouncer

Planted
dbt · data quality · automation

This note covers four tools for enforcing dbt documentation completeness in CI. Without enforcement, coverage erodes as models and columns are added without descriptions. The tools operate at different levels of granularity and can be layered.

dbt-coverage

dbt-coverage is the most straightforward enforcement tool. It calculates the percentage of columns with non-empty descriptions across your project and fails CI when coverage drops below a threshold.

# Generate a coverage report
dbt-coverage compute doc --manifest target/manifest.json
# Fail CI if coverage is below 80%
dbt-coverage compute doc --manifest target/manifest.json --cov-fail-under 0.80

--cov-fail-under 0.80 requires that 80% of all columns across all models have a non-empty description.
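Conceptually, the computation is a simple ratio over the manifest. A simplified Python sketch of what dbt-coverage measures (the real tool also merges in catalog.json so that columns present in the warehouse but absent from the YAML are counted; this sketch only reads manifest.json):

```python
import json


def doc_coverage(manifest_path: str) -> float:
    """Fraction of model columns with a non-empty description.

    Simplified sketch of dbt-coverage's doc metric: counts only
    columns declared in manifest.json, skipping non-model nodes.
    """
    with open(manifest_path) as f:
        manifest = json.load(f)

    total = documented = 0
    for node in manifest["nodes"].values():
        if node.get("resource_type") != "model":
            continue
        for column in node.get("columns", {}).values():
            total += 1
            if column.get("description", "").strip():
                documented += 1
    # A project with no columns counts as fully covered.
    return documented / total if total else 1.0
```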

The critical nuance: dbt-coverage checks for non-empty descriptions, not quality. customer_id: "The ID of the customer" passes the coverage check while being practically useless. Coverage tools tell you whether documentation exists, not whether it’s good. That’s where human review (and AI review) comes in.

Progressive Thresholds

Rather than starting at 80% from day one, many teams ratchet up the threshold as they improve:

# CI pipeline
steps:
  - name: Check documentation coverage
    run: |
      CURRENT=$(dbt-coverage compute doc --manifest target/manifest.json | grep "Total" | awk '{print $NF}')
      # Fail if coverage decreased from the last known baseline
      if (( $(echo "$CURRENT < $BASELINE_COVERAGE" | bc -l) )); then
        echo "Documentation coverage dropped from $BASELINE_COVERAGE to $CURRENT"
        exit 1
      fi

This approach prevents backsliding without requiring a specific absolute number. Each PR must maintain or improve coverage, never reduce it.
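The shell step above compares against a baseline held in an environment variable; the same ratchet can instead track a high-water mark in a committed file. A minimal Python sketch (the baseline file name and the auto-raising behavior are illustrative choices, not dbt-coverage features):

```python
from pathlib import Path


def ratchet(current: float, baseline_file: Path = Path(".coverage-baseline")) -> None:
    """Fail if coverage dropped below the stored baseline; otherwise
    raise the baseline so future PRs must keep the new level."""
    baseline = float(baseline_file.read_text()) if baseline_file.exists() else 0.0
    if current < baseline:
        raise SystemExit(
            f"Documentation coverage dropped from {baseline:.2%} to {current:.2%}"
        )
    # Coverage held or improved: store the new high-water mark.
    baseline_file.write_text(f"{max(current, baseline):.4f}")
```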

dbt-checkpoint

dbt-checkpoint catches undocumented columns before they even reach the PR stage by running as a pre-commit hook. The check-model-columns-have-desc hook validates that every column in your schema.yml files has a description:

.pre-commit-config.yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v2.0.0
    hooks:
      - id: check-model-columns-have-desc
        name: Check model columns have descriptions

Pre-commit hooks provide faster feedback than CI — the failure surfaces at commit time rather than after the pipeline runs. This catches the common failure mode of adding a column without updating the YAML.

dbt-checkpoint includes other useful hooks beyond documentation: check-model-has-tests-by-name ensures models have minimum test coverage, check-model-has-properties-file ensures every model has a corresponding YAML file, and check-source-has-freshness validates that sources have freshness checks.
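The core of the column-description check is straightforward. A sketch of the idea, operating on an already-parsed schema.yml (the real hook parses the YAML files staged in the commit; this function and its name are illustrative):

```python
def columns_missing_descriptions(schema: dict) -> list[str]:
    """Return 'model.column' identifiers whose description is missing or blank."""
    missing = []
    for model in schema.get("models", []):
        for column in model.get("columns", []):
            # A whitespace-only or absent description counts as undocumented.
            if not column.get("description", "").strip():
                missing.append(f"{model['name']}.{column['name']}")
    return missing
```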

dbt-score

dbt-score from Picnic Technologies goes beyond binary “has description / doesn’t have description” by assigning a 0-10 quality score per model. The score considers multiple factors: documentation coverage, test coverage, model naming conventions, and other configurable quality rules.

dbt-score lint --manifest target/manifest.json

The output gives you a per-model score that helps prioritize documentation work. A model with a score of 3/10 needs more attention than one at 7/10. You can set a minimum score threshold in CI, similar to dbt-coverage but covering a broader definition of quality.

Where dbt-coverage answers “how much documentation exists?”, dbt-score answers “how good is this model overall?” Documentation is one component of a model’s quality score, alongside testing, naming, and structure.

dbt-bouncer

dbt-bouncer enforces configurable conventions across the entire project. It’s more flexible than the other tools because you define the rules:

dbt-bouncer.yml
manifest_checks:
  - name: check_model_description_populated
    include: "models/marts"
  - name: check_column_description_populated
    include: "models/marts"
  - name: check_model_has_unique_test
  - name: check_model_names
    model_name_pattern: "^(base|int|mrt)__"

dbt-bouncer is particularly useful for teams with strong naming conventions and documentation standards that vary by layer. You might require 100% documentation for mart models (which external consumers query) while being more lenient on intermediate models (which are internal implementation details).
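For example, a hypothetical dbt-bouncer.yml could apply column-level checks only to marts while intermediate models need only a model-level description (check names and the include key follow the example above; the path patterns are illustrative):

```yaml
manifest_checks:
  # Marts: external consumers query these, so require full documentation.
  - name: check_model_description_populated
    include: "models/marts"
  - name: check_column_description_populated
    include: "models/marts"
  # Intermediate models: require only a model-level description.
  - name: check_model_description_populated
    include: "models/intermediate"
```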

Layering These Tools

These tools aren’t mutually exclusive. A practical CI setup layers them, with dbt-osmosis alongside to keep YAML scaffolding in sync with the warehouse:

| Tool | Stage | What it catches |
| --- | --- | --- |
| dbt-checkpoint | Pre-commit | Missing descriptions on changed models |
| dbt-osmosis | Pre-commit | Schema drift, missing YAML columns |
| dbt-coverage | CI pipeline | Overall documentation coverage decline |
| dbt-bouncer | CI pipeline | Convention violations, missing tests |
| dbt-score | CI pipeline (optional) | Overall model quality regression |

The pre-commit hooks provide instant feedback to the developer. The CI checks provide project-wide enforcement that catches issues the local hooks might miss (like a model in a different directory affected by a schema change).

  1. Start with dbt-coverage --cov-fail-under 0.50 to establish a baseline without blocking work
  2. Add dbt-checkpoint’s check-model-columns-have-desc as a pre-commit hook
  3. Ratchet up the coverage threshold by 5-10% each month
  4. Add dbt-bouncer rules for naming conventions once coverage is stable
  5. Target 80% as a steady-state minimum, using scaffolding tools and AI documentation to close gaps
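Put together, the pipeline-stage checks can run as sequential CI steps. A hypothetical GitHub Actions sketch, assuming the project can compile in CI (warehouse credentials, tool versions, and the bare dbt-bouncer invocation are assumptions to adapt):

```yaml
name: dbt-docs-ci
on: [pull_request]
jobs:
  docs-quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install dbt-core dbt-coverage dbt-bouncer
      # Assumes profiles/credentials are configured for CI.
      - run: dbt compile  # produces target/manifest.json
      - run: dbt-coverage compute doc --manifest target/manifest.json --cov-fail-under 0.80
      - run: dbt-bouncer  # reads dbt-bouncer.yml from the repo root
```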