The key decisions for organizing semantic model and metric YAML files are where they live relative to your SQL models, and how semantic models map to your existing dbt layers. A co-located approach works for small projects; larger projects benefit from a parallel subfolder structure.
Co-located Structure
For small projects (roughly fewer than 20 metrics, as a rule of thumb rather than a hard threshold), keep semantic models and metrics together with the SQL model they describe:
    models/
      marts/
        mrt__finance__orders.sql
        mrt__finance__orders.yml     # semantic model + metrics
        mrt__sales__customers.sql
        mrt__sales__customers.yml    # semantic model + metrics

The YAML file contains both the semantic model definition and any metrics built from it. This mirrors the standard dbt convention of co-locating model configs with SQL files.
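As a rough illustration of what such a co-located file might contain, here is a minimal sketch. The measure `order_total` and the metric `revenue_total` are hypothetical names chosen for the example, and the column names are assumed to exist in the mart.

```yaml
# models/marts/mrt__finance__orders.yml (sketch; column and metric names are assumptions)
semantic_models:
  - name: orders
    model: ref('mrt__finance__orders')
    defaults:
      agg_time_dimension: ordered_at
    entities:
      - name: order
        type: primary
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: order_total          # hypothetical numeric column on the mart
        agg: sum

metrics:
  - name: revenue_total            # hypothetical metric built from the measure above
    label: Total revenue
    type: simple
    type_params:
      measure: order_total
```

Everything a reader needs to trace `revenue_total` back to a column sits in one file, next to the SQL that produces that column.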
The advantage is simplicity. When you open a model, you see its metrics right there. When you modify the SQL, you immediately see which metrics might be affected.
The limitation is that metrics often span multiple semantic models. A customer_lifetime_value metric might reference measures from both orders and customers. In a co-located structure, where does that metric live? You end up making arbitrary placement decisions that break the “find it in 10 seconds” rule.
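To make the placement problem concrete, here is a sketch of such a metric, assuming two simple metrics already exist: `revenue_total` (defined on the orders semantic model) and `customer_count` (defined on the customers semantic model). Both names are hypothetical, and the ratio is a deliberately simplified stand-in for a real lifetime value calculation.

```yaml
metrics:
  - name: customer_lifetime_value
    label: Customer lifetime value
    type: ratio
    type_params:
      numerator: revenue_total     # assumed metric, defined alongside the orders mart
      denominator: customer_count  # assumed metric, defined alongside the customers mart
```

Neither `mrt__finance__orders.yml` nor `mrt__sales__customers.yml` is an obviously correct home for this definition, which is exactly the ambiguity described above.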
Parallel Subfolder Structure
For larger projects, separate semantic models from metrics and organize by domain:
    models/
      marts/
        mrt__finance__orders.sql
        mrt__sales__customers.sql
      semantic_models/
        orders.yml
        customers.yml
      metrics/
        revenue_metrics.yml
        customer_metrics.yml
        conversion_metrics.yml

This structure scales because cross-model metrics have a natural home. customer_lifetime_value goes in customer_metrics.yml regardless of which semantic models it references. Revenue metrics that combine orders and refund data go in revenue_metrics.yml.
The domain-based metric file grouping mirrors how users think about metrics. “Where are the revenue metrics?” has an obvious answer: metrics/revenue_metrics.yml. This aligns with the prefix-based naming convention where all revenue metrics start with revenue_.
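A domain file following that prefix convention might look like the sketch below. The measures `order_total` and `refund_total` are assumptions (they would need to be defined in the relevant semantic models), and the metric names are illustrative only.

```yaml
# metrics/revenue_metrics.yml (sketch; measure and metric names are assumptions)
metrics:
  - name: revenue_gross
    label: Gross revenue
    type: simple
    type_params:
      measure: order_total         # assumed measure on the orders semantic model

  - name: revenue_refunded
    label: Refunded revenue
    type: simple
    type_params:
      measure: refund_total        # assumed measure on a refunds semantic model

  - name: revenue_net
    label: Net revenue
    type: derived
    type_params:
      expr: revenue_gross - revenue_refunded
      metrics:
        - name: revenue_gross
        - name: revenue_refunded
```

The derived metric `revenue_net` spans two semantic models, yet its location is unambiguous because the file is organized by domain rather than by source model.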
The trade-off is indirection. Modifying mrt__finance__orders.sql requires checking a separate directory for affected semantic models and metrics. In a co-located structure, you see the impact immediately.
The One Primary Entity Rule
Each semantic model should have exactly one primary entity, typically aligned with one of your mart-layer models:
    semantic_models:
      - name: orders
        defaults:
          agg_time_dimension: ordered_at
        model: ref('mrt__finance__orders')
        entities:
          - name: order
            type: primary
          - name: customer
            type: foreign
          - name: product
            type: foreign

The primary entity (order) identifies what each row represents. Foreign entities (customer, product) enable joins to other semantic models. MetricFlow uses these entity relationships to build the semantic graph: the map of how your data connects.
This constraint keeps the graph navigable. If a semantic model has two primary entities, MetricFlow cannot determine the grain of the model, and joins become ambiguous. One row, one primary entity, no exceptions.
The mapping is usually straightforward: your mart mrt__finance__orders maps to semantic model orders with primary entity order. Your mart mrt__sales__customers maps to semantic model customers with primary entity customer. The semantic model is a thin layer on top of the mart, adding entity, dimension, and measure annotations to the columns that already exist.
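As a sketch of that thin layer for the customers mart, the example below annotates columns as an entity, dimensions, and a measure. The column names (`customer_id`, `first_ordered_at`, `customer_segment`) are assumptions about what the mart contains, not something the project layout guarantees.

```yaml
semantic_models:
  - name: customers
    model: ref('mrt__sales__customers')
    defaults:
      agg_time_dimension: first_ordered_at
    entities:
      - name: customer
        type: primary
        expr: customer_id          # assumed key column on the mart
    dimensions:
      - name: first_ordered_at     # assumed timestamp column
        type: time
        type_params:
          time_granularity: day
      - name: customer_segment     # assumed categorical column
        type: categorical
    measures:
      - name: customer_count
        agg: count_distinct
        expr: customer_id
```

Because the entity is named `customer` here and appears as a foreign entity with the same name in the orders semantic model, MetricFlow can connect the two models through it.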
Semantic Models Map to Marts
Semantic models should reference mart-layer models, not base or intermediate models. This is consistent with the three-layer architecture: marts are the consumption layer. Semantic models sit on top of that consumption layer, adding metadata for the semantic layer.
Pointing a semantic model at an intermediate model creates fragile coupling. Intermediate models are internal to your dbt project — they can be refactored, split, or merged without notice to downstream consumers. Marts, by contrast, are stable interfaces with defined consumers.
    # Good: semantic model on a mart
    semantic_models:
      - name: orders
        model: ref('mrt__finance__orders')

    # Bad: semantic model on an intermediate
    semantic_models:
      - name: orders
        model: ref('int__order__order_lj_customer')

The intermediate model might change its grain or get renamed during a refactor. The mart is a contract with its consumers.
Validation in CI
However you organize your files, validate the semantic layer in CI:
    - name: Validate semantic layer
      run: dbt sl validate

    # Or for dbt Core:
    mf validate-configs

MetricFlow validates at three levels:
- Parsing validation — Does the YAML follow the schema?
- Semantic validation — Are names unique? Do references exist? Is there exactly one primary entity?
- Data platform validation — Do the referenced columns exist in physical tables?
Add --verbose-issues --show-all when debugging failures. Validation catches broken references, duplicate names, and missing entities before they reach production.
This is especially important with the parallel subfolder structure, where a change to a semantic model in one directory can break metrics defined in another. CI validation is the safety net that makes the separation of files safe.
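For context, a complete workflow around that validation step could look roughly like the following GitHub Actions sketch for dbt Core. The workflow name, the Python version, the `postgres` adapter extra, and the assumption that warehouse credentials and a `profiles.yml` are already available to the job are all placeholders to adapt.

```yaml
# .github/workflows/semantic-layer.yml (sketch; adapter and credentials setup are assumptions)
name: semantic-layer-ci

on:
  pull_request:
    paths:
      - "models/**"                # run when models, semantic models, or metrics change

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dbt and MetricFlow
        run: python -m pip install "dbt-metricflow[postgres]"  # swap in your adapter

      - name: Parse project
        run: dbt parse             # builds the semantic manifest that MetricFlow reads

      - name: Validate semantic layer
        run: mf validate-configs --verbose-issues --show-all
```

If the CI job has no warehouse access, `mf validate-configs --skip-dw` still runs the parsing and semantic checks and skips only the data platform validation.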
When to Restructure
The signal to move from co-located to parallel structure is not a specific metric count. It is the moment you find yourself putting a metric in a file where it does not logically belong because “it has to go somewhere.” That is the co-located structure breaking down.
A project with 30 metrics that all map 1-to-1 to single semantic models works fine with co-location. A project with 15 metrics where half of them are derived or ratio metrics spanning multiple semantic models needs the parallel structure from day one.
Start co-located. Migrate when placement decisions become arbitrary. The restructuring is a file move, not a logic change — metric and semantic model definitions stay identical regardless of which directory they live in.