dbt Documentation Audience Mismatch

dbt documentation frequently goes unread. A 2024 Informatica survey found that 79% of organizations have undocumented data pipelines. Across 50,000+ teams using dbt weekly, the gap between what the documentation system can produce and what teams actually maintain remains wide. The causes are structural: audience mismatch, interface limitations, scope constraints, and maintenance drift when nothing enforces coverage.

Audience mismatch

Engineers write documentation from an engineering perspective. The people who most need data definitions (analysts, marketing directors, product managers) find technically oriented documentation unintuitive and unhelpful.

As Select Star’s 2025 analysis puts it: “Many teammates never open dbt Docs. Non-technical users want ERDs, lineage, and in-app search. When definitions live only in YAML, they are hard to find.”

A model described as “Joins stg_orders with stg_customers on customer_id and filters for completed status” tells an engineer how the model works but tells an analyst nothing about what they can use it for. The same model described as “Completed e-commerce orders with customer details, one row per order, excludes cancellations” tells the analyst everything they need to decide if this is the right table for their analysis.

If a dbt project serves only analytics engineers, technically oriented descriptions are fine. Most dbt projects serve broader audiences.
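The contrast between the two descriptions above can be turned into a rough lint. The sketch below is only a heuristic under stated assumptions: the pattern list and the function name implementation_signals are hypothetical, chosen to match the wording of the example, and would need tuning for a real project.

```python
import re

# Hypothetical signal list: wording that usually explains how a model is
# built rather than what the data means. Tune it for your own project.
IMPLEMENTATION_PATTERNS = [
    r"\bjoins?\b",      # "joins", "join"
    r"\bfilters?\b",    # "filters", "filter"
    r"\bstg_\w+",       # references to staging models
    r"\bon \w+_id\b",   # join keys such as "on customer_id"
]


def implementation_signals(description: str) -> list[str]:
    """Return the patterns found in a description, in list order."""
    lowered = description.lower()
    return [p for p in IMPLEMENTATION_PATTERNS if re.search(p, lowered)]
```

An empty result does not prove a description is useful to an analyst. A non-empty one is a prompt for the author to add the business meaning before the construction details.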

Interface limitations

Even when teams do write thorough descriptions, the dbt docs generate command produces a static site aimed at data engineers. The DAG visualization, the model tree, and the SQL-centric layout all assume the reader thinks in terms of transformations and lineage.

Business users who need to look up what “customer lifetime value” means in your organization don’t navigate by DAG. They want to search for a concept and find a clear definition. The default dbt docs interface doesn’t support that workflow well.

For dbt Core users, there’s an additional barrier: hosting. The built-in dbt docs serve command is intended for local development, not production hosting. Alternatives such as GitHub Pages, S3 with Lambda, or Netlify require DevOps work that most data teams can’t prioritize. The result is documentation that technically exists but is not practically available to the people who need it.

Pushing descriptions to warehouse comments via persist_docs sidesteps the hosting problem. Descriptions appear in BigQuery’s schema panel, Snowflake’s column comments, and any BI tool that reads warehouse metadata, so analysts see the documentation without ever opening the docs site. If adoption is your primary problem, this is often the highest-leverage fix.
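One way to check how far persist_docs has actually been rolled out is to inspect the manifest that dbt writes after a run. The sketch below assumes the manifest has already been loaded into a dictionary and that each model node carries a config.persist_docs mapping with relation and columns flags. That layout is an assumption to verify against the manifest schema of your dbt version, and the function name is hypothetical.

```python
def models_without_persist_docs(manifest: dict) -> list[str]:
    """List models whose descriptions are not fully persisted to the warehouse.

    Assumes manifest["nodes"] maps unique ids to node dictionaries and that
    node["config"]["persist_docs"] holds "relation" and "columns" flags.
    """
    missing = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        persist = node.get("config", {}).get("persist_docs") or {}
        if not (persist.get("relation") and persist.get("columns")):
            missing.append(node["name"])
    return sorted(missing)
```

In practice the dictionary would come from reading target/manifest.json with json.load. Any model the function returns has descriptions that analysts can only see by opening the docs site.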

Scope constraints

dbt Docs only covers assets managed within the dbt project. BI tools, non-dbt warehouse tables, Fivetran sources, Airflow DAGs, and external pipelines stay invisible. A business user looking for documentation on the data behind their Looker dashboard may not find what they need because the relevant table sits outside the dbt boundary.

Data consumers get a partial picture that may not include the tables they care about most. This is a fundamental limitation of dbt documentation as a standalone solution. Teams where the data platform includes more than “dbt + a warehouse” eventually need a data catalog that pulls metadata from all their tools into a unified view.

Maintenance drift

Without enforcement, documentation coverage erodes sprint over sprint. Ownership stays unclear and reviews are ad hoc. Descriptions fall out of sync with the models they describe, and stale documentation actively misleads the readers who trust it.

Even when teams initially write business-friendly descriptions, those descriptions drift as the project evolves. The engineers who notice the drift are the least likely to update prose descriptions for a non-engineering audience.
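Coverage erosion is easier to catch when a number fails the build. The following is a minimal sketch of such a check, assuming the dbt manifest has been loaded into a dictionary whose nodes expose resource_type, name, description, and columns fields. The function names and the 0.8 default threshold are hypothetical, not part of dbt.

```python
def documentation_coverage(manifest: dict) -> dict:
    """Summarize description coverage for models and their declared columns."""
    models = [
        node for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model"
    ]
    undocumented_models = []
    undocumented_columns = []
    total_columns = 0
    for node in models:
        if not (node.get("description") or "").strip():
            undocumented_models.append(node["name"])
        for column_name, column in (node.get("columns") or {}).items():
            total_columns += 1
            if not (column.get("description") or "").strip():
                undocumented_columns.append(f"{node['name']}.{column_name}")

    def ratio(missing: int, total: int) -> float:
        # Nothing to document counts as fully covered.
        return 1.0 if total == 0 else (total - missing) / total

    return {
        "model_coverage": ratio(len(undocumented_models), len(models)),
        "column_coverage": ratio(len(undocumented_columns), total_columns),
        "undocumented_models": sorted(undocumented_models),
        "undocumented_columns": sorted(undocumented_columns),
    }


def passes_threshold(report: dict, minimum: float = 0.8) -> bool:
    """Hypothetical CI gate: both ratios must meet the minimum."""
    return (
        report["model_coverage"] >= minimum
        and report["column_coverage"] >= minimum
    )
```

A CI job could load target/manifest.json, print the undocumented names, and exit non-zero when passes_threshold returns False. This only counts columns that are declared in the project's YAML, and it measures presence rather than quality, so a stale description still passes and review remains necessary.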

Patterns that improve adoption

Write for business purpose rather than technical function: what the data means, not how it’s built
Put descriptions where people already work (warehouse comments, BI tool integrations) rather than expecting people to visit a separate docs site
Customize the docs site entry point with an __overview__ doc block that orients non-technical users
Use doc blocks with sections for different audiences — business context at the top, technical details below
Enforce documentation in CI so coverage does not erode silently
Treat documentation quality as infrastructure for AI tools that read schema descriptions to generate code