
Documentation Quality Determines AI Usefulness

Why the quality of your dbt documentation directly determines how useful AI tools can be — the Roche chatbot failure, the docs-to-AI feedback loop, and case studies in enforcement

Planted
dbt · ai · data quality

Documentation quality directly determines how useful AI tools can be for querying, generating SQL from, and building on top of a data platform. Roche, the pharmaceutical company, built a chatbot on top of their technical documentation; engineers dismissed it as useless. After investigation, they concluded their documentation “wasn’t up to snuff.” Head of Engineering Yannick Misteli’s stated conclusion: “If your documentation isn’t good, your chatbot won’t be, either.”

The Feedback Loop

The relationship between documentation quality and AI tool usefulness runs in both directions:

Docs feed AI. AI tools read schema descriptions to understand what tables and columns represent. A model described as “Orders table” gives the AI nothing to work with. A model described as “Completed e-commerce orders from Shopify, one row per order, excludes cancelled and test transactions” gives the AI enough context to generate correct queries, recommend the right table for a business question, and avoid joining on the wrong grain.
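To make the contrast concrete, here is a minimal dbt `schema.yml` sketch (model and column names are hypothetical) showing the thin description next to one an AI assistant can actually use:

```yaml
# models/marts/schema.yml — illustrative sketch; names are hypothetical
models:
  - name: orders
    # Too thin — gives an AI assistant nothing to reason with:
    # description: "Orders table"
    description: >
      Completed e-commerce orders from Shopify, one row per order.
      Excludes cancelled and test transactions.
    columns:
      - name: order_id
        description: Primary key; unique Shopify order identifier.
      - name: order_total
        description: Gross order value in USD, before refunds and discounts.
```

The second description states source system, grain, and exclusions in one place, which is exactly the context a query generator needs to pick the right table and the right filters.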

AI feeds docs. AI documentation tools read existing descriptions as context when generating new ones. If your base model describes customer_id poorly, every downstream AI-generated description inherits that poor definition. The AI doesn’t just fail to improve bad documentation — it propagates it through your project.

Bad docs compound through AI outputs. When an AI tool confidently generates SQL based on a description that no longer reflects reality — because the description went stale — the resulting query is silently wrong. Unlike a syntax error that fails loudly, a semantic error (querying the wrong metric because the description said it was net revenue when it’s actually gross) produces plausible-looking results. The dashboard looks fine. The numbers are wrong.

The same description that helps an analyst understand a table also helps an AI assistant generate correct SQL. The multiplier effect from documentation quality has grown as AI tool adoption has increased.

What Enforcement Delivers

FINN, a global car subscription service, implemented pre-commit hooks for model ownership, naming conventions, and description requirements. Datafold reported that FINN “broke through a speed-quality frontier” where automated checks eliminated the tradeoff between data quality and developer velocity. Enforcement was automatic, producing both faster development and higher documentation coverage.

Kiwi.com paired dbt with Atlan and cut engineering documentation workload by 53% in 90 days. The key insight: combining dbt’s authoring layer with a dedicated catalog reduced the burden instead of doubling it. Engineers wrote descriptions once in YAML, and the catalog made those descriptions discoverable to business users and AI tools alike.

Dmytro Arkhypov’s team found that their Confluence pages had become unreliable — some models documented in an outdated state, others stuck in draft. Moving documentation into dbt made it possible to automate validation. They implemented dbt-checkpoint for enforcement and built a custom doc generator that cut manual documentation time in half. The move from a wiki to in-code documentation wasn’t about the documentation format; it was about making automation possible.
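A pre-commit setup along the lines FINN and Arkhypov's team describe might look like the sketch below. The hook ids follow dbt-checkpoint's naming, but treat the ids and the pinned revision as assumptions to verify against the dbt-checkpoint documentation:

```yaml
# .pre-commit-config.yaml — a sketch, not a verified configuration
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v2.0.6  # assumption: pin whatever release you have vetted
    hooks:
      # Reject commits where a model lacks a description
      - id: check-model-has-description
      # Reject commits where a model has no YAML properties file at all
      - id: check-model-has-properties-file
```

Because the check runs before the commit lands, undocumented models never enter the repository, which is what removes the reliance on discipline.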

Documentation that lives near the code and is enforced automatically stays accurate. Documentation that lives in a separate system and depends on discipline drifts.

The AI-Ready Documentation Standard

What makes documentation “AI-ready” is the same thing that makes it useful for humans, with a few additional considerations:

  • Explicit grain statements. “One row per customer per day” helps an AI avoid incorrect aggregations. Without it, the AI guesses, and often guesses wrong.
  • Explicit exclusions. “Excludes cancelled and test transactions” prevents an AI from including records that should be filtered. This is the most common source of AI-generated SQL errors from poor documentation.
  • Consistent naming conventions. AI tools learn patterns from your project. If some models use amount to mean gross revenue and others use it to mean net revenue, the AI has no way to disambiguate.
  • Business context, not just technical function. RAG pipelines and CLAUDE.md files help bridge this gap, but the foundation is descriptions that explain what the data means to the business, not just how the SQL transforms it.

The three-question framework — source system, grain, inclusions/exclusions — produces descriptions that serve both humans and AI tools well. Descriptions that answer these questions give AI enough context to generate useful code. Descriptions that don’t answer them force the AI to hallucinate context, which is how you get queries that look correct but join on the wrong key or filter on the wrong status.
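As a rough illustration, the three-question framework can be mechanized with keyword heuristics. The patterns below are my own illustrative approximations, not a dbt or dbt-checkpoint feature:

```python
# Heuristic check: does a description name its source system, state its
# grain, and declare inclusions/exclusions? Keyword lists are illustrative.
import re

CHECKS = {
    "source": re.compile(r"\b(from|sourced from|source system)\b", re.I),
    "grain": re.compile(r"\bone row per\b", re.I),
    "exclusions": re.compile(r"\b(excludes?|includes? only|filtered to)\b", re.I),
}

def missing_answers(description: str) -> list[str]:
    """Return which of the three questions a description fails to answer."""
    return [name for name, pat in CHECKS.items() if not pat.search(description)]

good = ("Completed e-commerce orders from Shopify, one row per order, "
        "excludes cancelled and test transactions.")
bad = "Orders table"

print(missing_answers(good))  # []
print(missing_answers(bad))   # ['source', 'grain', 'exclusions']
```

A check like this could run in CI to flag descriptions that leave the AI guessing, though real enforcement would want review by a human rather than keyword matching alone.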

Practical Implications

Documentation quality affects the productivity of AI-augmented workflows across the data stack: SQL generation, data exploration, automated testing, anomaly investigation. A mart model where every description includes source system, grain, and exclusions gives AI tools enough context to generate correct queries. A model described only as “Orders table” does not.

Relevant implementation notes: dbt Documentation CI Enforcement covers automated enforcement; dbt documentation automation strategy covers maintenance patterns; RAG for dbt Documentation covers the business-context layer that automation cannot generate.