Lightdash Joins and Fanout Protection

Joins in Lightdash let users combine dimensions and metrics from multiple dbt models in a single Explore query. Without joins, each model is an isolated island — users can only analyze dimensions and metrics that live on the same model. With joins, an orders model can pull in customer attributes from a CRM model, campaign data from a marketing model, or product details from a catalog model, all in a single query without anyone writing SQL.

Defining a Join

Joins go under meta.joins at the model level:

models:
  - name: mrt__sales__orders
    meta:
      primary_key: order__id
      joins:
        - join: mrt__crm__customers
          sql_on: >
            ${mrt__sales__orders.customer__id}
            = ${mrt__crm__customers.customer__id}
          relationship: many-to-one
          type: left

The sql_on clause uses Lightdash’s dimension reference syntax: ${model_name.column_name}. This is a Jinja-like template that gets compiled into the actual SQL join condition. Using dimension references rather than raw column names means the join respects any column aliasing or schema overrides you’ve configured.

The type defaults to left if omitted. The four options are inner, left, right, and full. Left joins are the right default for most BI use cases — you want all rows from the primary model, with nulls where the joined model has no match, rather than silently dropping unmatched rows.

The relationship Property and Why It Exists

The relationship property is not cosmetic. Lightdash uses it to determine which metrics from the joined model are safe to use.

The options are many-to-one, one-to-one, and one-to-many. Getting this wrong is one of the most common sources of inflated metrics in BI tools.

Here’s the problem: in a one-to-many join, each row in the primary model matches multiple rows in the joined model. This causes row duplication on the primary model’s side. Any metric that sums or counts rows from the primary model will be inflated by the multiplication factor.

Consider an orders model joined to a line_items model (one order, many line items). If you calculate total_revenue (a sum on the orders model) through this join, each order row gets counted once for every line item it contains. An order with five line items contributes five times to the sum. The result is wildly wrong.

Lightdash uses the relationship property to flag this risk. In a one-to-many join:

Flagged as risky: sum, average, count metrics from the primary model
Safe to use: min, max, count_distinct from the primary model

Lightdash warns users before executing a query that combines a flagged metric with a one-to-many joined dimension. The warning doesn’t block the query — users can proceed — but it surfaces the risk at the point of analysis rather than after the fact.

joins:
  - join: mrt__sales__line_items
    sql_on: >
      ${mrt__sales__orders.order__id}
      = ${mrt__sales__line_items.order__id}
    relationship: one-to-many    # triggers fanout warnings on sum/count metrics
    type: left

The correct fix for this pattern is usually in the dbt models themselves: if you need order-level metrics and line-item dimensions together, build a mart that pre-joins them at the right grain rather than relying on a runtime join. But when a runtime join is genuinely needed, the relationship property at least makes the risk visible.

Additional Join Properties

fields

Whitelists specific dimensions or metrics from the joined model. By default, all non-hidden fields from the joined model appear in the Explore sidebar. fields narrows that to just what’s useful:

joins:
  - join: mrt__crm__customers
    sql_on: >
      ${mrt__sales__orders.customer__id}
      = ${mrt__crm__customers.customer__id}
    relationship: many-to-one
    type: left
    fields:
      - customer__name
      - customer__segment
      - customer__country

This is particularly valuable for large models. If your customers model has 40 columns but you only need 3 of them in the context of the orders Explore, exposing all 40 creates a cluttered sidebar and exposes columns that might not make sense when accessed through the orders lens.

hidden

hidden: true joins the model without exposing it in the sidebar at all. The join still executes — the joined model’s columns become available for filtering — but users cannot see or select its dimensions as explicit fields:

joins:
  - join: mrt__access__user_permissions
    sql_on: >
      ${mrt__sales__orders.customer__id}
      = ${mrt__access__user_permissions.user__id}
    relationship: many-to-one
    type: inner
    hidden: true

The use case here is row-level filtering. A hidden join to a permissions model can enforce that users only see data they’re authorized for, without exposing the permission model as a navigable entity in the Explore view.

alias

Renames the joined model when the same model needs to appear twice:

joins:
  - join: mrt__crm__contacts
    alias: billing_contact
    sql_on: >
      ${mrt__sales__orders.billing_contact__id}
      = ${billing_contact.contact__id}
    relationship: many-to-one
    type: left
  - join: mrt__crm__contacts
    alias: shipping_contact
    sql_on: >
      ${mrt__sales__orders.shipping_contact__id}
      = ${shipping_contact.contact__id}
    relationship: many-to-one
    type: left

Without aliasing, you cannot join the same model twice — the second join would have no way to distinguish itself from the first. With aliasing, both joins use different names, and their fields appear as separate sections in the sidebar (billing_contact.* and shipping_contact.*).

When to Use Joins vs. Wider Mart Models

The tension in BI design: should you pre-join everything into a wide mart, or keep models narrow and use BI-layer joins?

Pre-joined marts perform better (the join executes once at transform time, not on every query), and there’s no fanout risk to manage. They’re the right choice when a join is frequent, the cardinality is stable, and the combined grain makes semantic sense as a single entity.

BI-layer joins are better for ad-hoc exploration — when you occasionally need to combine two models that don’t always go together, or when the cross-product of combinations would require too many pre-built marts. Lightdash’s fanout warnings make ad-hoc joins safer than they’d be in a tool without relationship-awareness.

The practical heuristic: if analysts are constantly joining models A and B in the Explore view, that’s a signal to pre-build that join in dbt. If they join A and B occasionally, for specific analyses, the Explore join is the right choice. Keep the mart layer for stable, frequently-needed combinations; let Lightdash joins handle the exploratory edge cases.