
dbt Docs Performance at Scale

Why the default dbt docs site becomes unusable for large projects — the AngularJS frontend, client-side JSON parsing, and the performance ceiling that drives teams to alternatives

Tags: dbt, data engineering

The default dbt docs site works fine for small projects. At larger scale, performance degrades due to the client-side architecture. The notes below describe the bottleneck and the mitigations available within the default site, as context for deciding when to move to alternatives.

The architecture problem

The dbt docs frontend is an AngularJS 1.x single-page application. AngularJS reached end-of-life in December 2021. The framework has not received updates since, and the dbt-docs repository reflects this — the frontend architecture has remained largely unchanged.

All parsing of manifest.json and catalog.json happens client-side, on the browser’s main thread. When a user opens the docs site:

  1. The browser downloads manifest.json and catalog.json (or the combined static_index.html)
  2. JavaScript parses the full JSON payloads into memory
  3. AngularJS builds the model tree, resolves relationships, and renders the UI
  4. The user can interact with the site

Every step happens in the browser. There is no server-side rendering, no pagination, no lazy loading of model data. The entire project is loaded into memory at once.
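The cost of step 2 is easy to underestimate. A rough sketch of the blocking parse, timed against a synthetic manifest-shaped payload (the shape here is hypothetical and much leaner than a real dbt manifest, which carries far more per-node detail, so this understates the problem):

```python
import json
import time

# Rough stand-in for manifest.json: N model nodes with a few fields each.
# (Hypothetical shape — the real manifest includes columns, docs, configs, etc.)
def synthetic_manifest(n_models: int) -> str:
    nodes = {
        f"model.project.model_{i}": {
            "name": f"model_{i}",
            "description": "x" * 200,  # doc text bulks payloads up fast
            "depends_on": [f"model.project.model_{j}" for j in range(max(0, i - 3), i)],
        }
        for i in range(n_models)
    }
    return json.dumps({"nodes": nodes})

payload = synthetic_manifest(5_000)

start = time.perf_counter()
parsed = json.loads(payload)  # analogous to the browser's blocking main-thread parse
elapsed = time.perf_counter() - start

print(f"payload: {len(payload) / 1e6:.1f} MB, "
      f"parse: {elapsed * 1000:.0f} ms, nodes: {len(parsed['nodes'])}")
```

Parsing is only the first hit; the browser then still has to build the model tree and render it, all on the same thread.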

The numbers at scale

Dagster analyzed GitLab’s dbt documentation site and published concrete benchmarks:

  • JSON payload size: 58 MB
  • Page load time: ~48 seconds
  • Browser memory usage: ~350 MB
  • Lighthouse performance score: 68/100

At that scale, the docs site is effectively unusable. Users wait nearly a minute for the page to become interactive. Browser tabs consume hundreds of megabytes of RAM. On machines with limited memory (common with corporate laptops running multiple applications), the tab may crash entirely.

The threshold is lower than you think

You do not need GitLab’s scale to feel the pain. Projects with 500+ models start to show noticeable sluggishness — a few seconds of loading, delayed search results, a DAG visualization that freezes while rendering complex lineage graphs. By 1,000 models, the experience degrades meaningfully for everyday use.

The --select flag on dbt docs generate helps by limiting catalog generation, but the manifest still contains your entire project. If your manifest alone is large (many models, macros, and cross-project references), the frontend performance suffers regardless of catalog scope.
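One way to gauge where you stand is to tally nodes in the compiled manifest by resource type. A minimal sketch, assuming the node-ID convention dbt uses in target/manifest.json (IDs look like "model.project.name"):

```python
import json
from collections import Counter

def node_counts(manifest_path: str) -> Counter:
    """Tally manifest nodes by resource type (model, test, seed, ...)."""
    with open(manifest_path) as f:
        manifest = json.load(f)
    return Counter(
        node_id.split(".", 1)[0]  # node IDs look like "model.project.name"
        for node_id in manifest.get("nodes", {})
    )

# Demo against a tiny hand-written manifest; in practice, point this at
# target/manifest.json after running dbt docs generate.
import os
import tempfile

sample = {"nodes": {"model.p.a": {}, "model.p.b": {}, "test.p.t": {}}}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(sample, f)
counts = node_counts(f.name)
os.unlink(f.name)
print(counts)  # Counter({'model': 2, 'test': 1})
```

If the model count is already in the high hundreds, the numbers in the previous section are not someone else's problem.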

VPN and network effects

Many teams host dbt docs behind a corporate VPN. Users accessing the site over VPN connections face slower transfer speeds, which amplify the JSON payload problem. A 20 MB payload that loads in 2 seconds on a local network might take 10-15 seconds over a VPN. Combined with the client-side parsing overhead, the total wait can push past 30 seconds for medium-sized projects.

If your team accesses docs remotely (common for distributed teams), factor transfer time into your hosting decision. CDN-backed hosting helps with transfer speed, but cannot fix the client-side parsing bottleneck.
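The back-of-envelope arithmetic is worth making explicit. A sketch, using an assumed (not measured) browser parse rate:

```python
def estimated_load_seconds(payload_mb: float, mbps: float,
                           parse_mb_per_s: float = 25.0) -> float:
    """Rough wait time: network transfer plus main-thread JSON parse.

    parse_mb_per_s is an assumed parse throughput, not a measured constant;
    real figures vary by browser and hardware.
    """
    transfer = payload_mb * 8 / mbps       # megabytes over a link rated in Mbps
    parse = payload_mb / parse_mb_per_s
    return transfer + parse

# 20 MB payload: local network (~100 Mbps) vs. a congested VPN (~10 Mbps)
print(f"local: {estimated_load_seconds(20, 100):.1f} s")
print(f"vpn:   {estimated_load_seconds(20, 10):.1f} s")
```

The parse term is fixed regardless of hosting, which is why a CDN improves the transfer term but leaves the floor on load time untouched.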

What you can do within the default site

Short of replacing the frontend, a few mitigations help:

  • Use the --static flag to produce a single file, avoiding multiple HTTP requests for separate JSON files
  • Scope catalog generation with --select to reduce catalog.json size for the models that matter most
  • Hide intermediate models with docs.show: false to reduce the number of nodes the frontend processes (see dbt Docs Site Customization Options)
  • Host behind a CDN (CloudFront, Cloud CDN, Cloudflare) to minimize transfer time for the static assets

These are palliative measures. They buy time but do not solve the fundamental issue: the frontend was designed for projects with dozens of models, not hundreds or thousands.
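For the docs.show mitigation, the setting lives in dbt_project.yml. A sketch, assuming a hypothetical project whose intermediate models sit under an intermediate/ directory:

```yaml
# dbt_project.yml — hide intermediate models from the docs UI
models:
  my_project:          # hypothetical project name
    intermediate:      # hypothetical subdirectory of models/
      +docs:
        show: false
```

Hidden models still appear in manifest.json; this trims what the frontend renders, not what it downloads and parses.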

When to switch

Signals that point toward replacement frontends or data catalogs: team complaints about load times, users avoiding the docs site, workarounds built to bypass the default UI. The performance ceiling of the AngularJS frontend is structural — better hosting or caching does not overcome it.