Adrienne Vermorel

Getting Started with the dbt Semantic Layer and MetricFlow

The dbt Semantic Layer lets you define metrics once and query them from any tool, so your organization gets consistent numbers everywhere. Setting it up is more accessible than you might expect.

This guide walks through installing MetricFlow, configuring your first semantic model, defining metrics, and running queries. By the end, you’ll have a working semantic layer you can extend as your needs grow.

Understanding the architecture

MetricFlow is the SQL query generation engine behind the dbt Semantic Layer. When you define metrics in YAML, MetricFlow builds a semantic graph (a relationship structure that maps how your tables connect through entities). Think of it as a navigation system that knows how to join tables without you spelling out every relationship.

When you query a metric, MetricFlow:

  1. Reads your YAML definitions
  2. Determines which joins are needed
  3. Generates optimized SQL for your data platform
  4. Executes the query and returns results

MetricFlow handles join logic automatically. You define entities (the join keys) once, and it figures out how to navigate between tables.
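
As a rough sketch of that idea (model names and columns here are placeholders, and required pieces such as time dimensions are omitted), two semantic models that declare the same customer_id entity give MetricFlow everything it needs to join them:

semantic_models:
  - name: orders
    model: ref('orders')
    entities:
      - name: order_id
        type: primary
      - name: customer_id
        type: foreign         # many orders can point at one customer
  - name: customers
    model: ref('customers')
    entities:
      - name: customer_id
        type: primary         # one row per customer

A query that asks for an orders measure grouped by a customers dimension is then resolved through the shared customer_id entity, with MetricFlow writing the join itself.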

dbt Core vs dbt Cloud capabilities

Capabilities differ between the two:

Feature                              | dbt Core    | dbt Cloud
Define semantic models and metrics   | Yes         | Yes
Generate SQL from configs            | Yes         | Yes
Query via CLI                        | mf commands | dbt sl commands
Query via APIs (JDBC, GraphQL)       | No          | Yes
BI tool integrations                 | No          | Yes
Exports to materialize metrics       | No          | Yes

If you’re on dbt Core, you get full metric definition and CLI querying. The API layer and BI integrations require dbt Cloud.

Installation

For dbt Core, install the MetricFlow bundle matching your adapter:

pip install "dbt-metricflow[dbt-snowflake]"

Replace dbt-snowflake with your adapter: dbt-bigquery, dbt-databricks, dbt-redshift, or dbt-postgres.

For dbt Cloud, MetricFlow is built in. You’ll use dbt sl commands instead of mf.

Supported platforms: Snowflake, BigQuery, Databricks, Redshift, Postgres (Core only), and Trino.

Setting up the time spine

Cumulative metrics and time-based calculations require a time spine, a table with one row per day. MetricFlow uses this as a reliable date dimension to join against.

Create a model called metricflow_time_spine.sql:

{{ config(materialized='table') }}

{{ dbt.date_spine(
    datepart="day",
    start_date="CAST('2020-01-01' AS DATE)",
    end_date="CAST('2030-12-31' AS DATE)"
) }}

Then register it as the time spine in a YAML properties file:

models/metricflow_time_spine.yml
models:
  - name: metricflow_time_spine
    time_spine:
      standard_granularity_column: date_day
    columns:
      - name: date_day
        granularity: day

Run dbt build -s metricflow_time_spine to materialize it.

Defining a semantic model

Semantic models are YAML abstractions that describe your dbt models for MetricFlow. Each semantic model has a one-to-one relationship with a dbt SQL or Python model.

This semantic model describes an orders table:

models/semantic/sem_orders.yml
semantic_models:
  - name: orders
    description: Order transactions with revenue and quantity
    model: ref('mrt__sales__orders')
    defaults:
      agg_time_dimension: order__created_at
    entities:
      - name: order_id
        type: primary
      - name: customer_id
        type: foreign
      - name: product_id
        type: foreign
    dimensions:
      - name: order__created_at
        type: time
        type_params:
          time_granularity: day
      - name: order__status
        type: categorical
      - name: order__channel
        type: categorical
    measures:
      - name: order_total
        agg: sum
        expr: order__amount
        description: Sum of order amounts
      - name: order_count
        agg: count
        expr: order_id
        description: Count of orders
      - name: distinct_customers
        agg: count_distinct
        expr: customer_id

Breaking down the components

Entities are join keys that connect semantic models. Four types exist:

  • primary: Each value identifies exactly one row, every row has one, and nulls aren’t allowed (like order_id)
  • unique: Each value appears at most once, but nulls are allowed
  • foreign: A value can appear zero, one, or many times (like customer_id in an orders table)
  • natural: A column or combination of columns that uniquely identifies a record based on real-world data
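
Entities also accept an expr when the join key lives in a column with a different name. A small sketch, assuming a hypothetical buyer_id column in the underlying table:

entities:
  - name: order_id
    type: primary
  - name: customer_id
    type: foreign
    expr: buyer_id        # hypothetical column name; expr maps the entity onto it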

Dimensions enable grouping, slicing, and filtering. Use time for dates and categorical for string values.

Measures are the numeric aggregations that become building blocks for metrics. Supported aggregations: sum, count, count_distinct, avg, min, max, median, and percentile.
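
Measures also take a few optional properties. As a hedged sketch building on the orders model above (the order_amount_p95 name is made up for illustration), agg_params configures percentile aggregations, and create_metric: true exposes a measure directly as a simple metric without a separate metric definition:

measures:
  - name: order_amount_p95           # hypothetical measure for illustration
    agg: percentile
    expr: order__amount
    agg_params:
      percentile: 0.95               # which percentile to compute
      use_discrete_percentile: false # interpolated rather than discrete percentile
  - name: order_count
    agg: count
    expr: order_id
    create_metric: true              # also exposes order_count as a simple metric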

Defining metrics

Metrics reference the measures you defined. These simple metrics build on the orders semantic model:

models/semantic/metrics_orders.yml
metrics:
  - name: revenue
    description: Total order revenue
    type: simple
    label: Revenue
    type_params:
      measure: order_total
  - name: orders
    description: Total number of orders
    type: simple
    label: Order Count
    type_params:
      measure: order_count
  - name: unique_customers
    description: Count of distinct customers
    type: simple
    label: Unique Customers
    type_params:
      measure: distinct_customers
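
Metrics can also carry a filter that is applied before aggregation. A minimal sketch, reusing the order_total measure (the web_revenue name is made up for illustration; the dimension reference is prefixed with the semantic model's primary entity, order_id):

metrics:
  - name: web_revenue               # hypothetical metric for illustration
    description: Revenue from the web channel only
    type: simple
    label: Web Revenue
    type_params:
      measure: order_total
    filter: |
      {{ Dimension('order_id__order__channel') }} = 'web'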

Metric types for different needs

Simple metrics directly reference a measure, like the examples above.

Derived metrics calculate from other metrics. This example computes revenue per customer:

metrics:
  - name: revenue_per_customer
    description: Average revenue per unique customer
    type: derived
    label: Revenue per Customer
    type_params:
      expr: revenue / unique_customers
      metrics:
        - name: revenue
        - name: unique_customers

Ratio metrics compute ratios correctly by aggregating the numerator and denominator independently before dividing. Both sides reference metrics (here the revenue and orders metrics defined above), not measures:

metrics:
  - name: average_order_value
    description: Average value per order
    type: ratio
    label: AOV
    type_params:
      numerator: revenue
      denominator: orders
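
The numerator and denominator also accept an expanded form with a per-side filter. A hedged sketch, reusing the revenue metric (the web_revenue_share name is made up for illustration):

metrics:
  - name: web_revenue_share         # hypothetical metric for illustration
    description: Share of total revenue coming from the web channel
    type: ratio
    label: Web Revenue Share
    type_params:
      numerator:
        name: revenue
        filter: |
          {{ Dimension('order_id__order__channel') }} = 'web'
      denominator:
        name: revenue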

Cumulative metrics aggregate over time windows:

metrics:
  - name: revenue_mtd
    description: Month-to-date revenue
    type: cumulative
    label: Revenue MTD
    type_params:
      measure: order_total
      grain_to_date: month
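
Cumulative metrics can also use a rolling window instead of grain_to_date. A minimal sketch in the same style (the revenue_rolling_7d name is made up for illustration; on recent dbt versions the window may need to live under cumulative_type_params instead):

metrics:
  - name: revenue_rolling_7d        # hypothetical metric for illustration
    description: Revenue summed over a trailing 7-day window
    type: cumulative
    label: Revenue (Rolling 7 Days)
    type_params:
      measure: order_total
      window: 7 days                # rolling window instead of grain_to_date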

Validating your configuration

Before querying, validate that your definitions parse correctly:

# dbt Core
mf validate-configs
# dbt Cloud
dbt sl validate

If validation fails, check for these common issues:

  • “ensure that you’ve ran an artifacts…” error: Run dbt parse or dbt build first to generate the semantic manifest.
  • Metafont conflict: The mf executable from a TeX/Metafont installation can shadow MetricFlow’s mf command on your PATH. If mf behaves unexpectedly, check which binary resolves first (for example with which mf).
  • Dimensions not appearing: Every dimension needs a primary entity in its semantic model.

Querying metrics

Generate the semantic manifest first:

dbt parse

Then query your metrics:

# dbt Core
mf query --metrics revenue --group-by metric_time
# dbt Cloud
dbt sl query --metrics revenue --group-by metric_time

Add dimensions for slicing:

mf query --metrics revenue,orders --group-by metric_time,order__channel

Filter with the --where flag:

mf query --metrics revenue --group-by metric_time --where "{{ Dimension('order_id__order__channel') }} = 'web'"

The filter syntax uses Jinja templating. Reference dimensions as {{ Dimension('primary_entity__dimension_name') }}, where the prefix is the primary entity of the semantic model that owns the dimension (order_id here).

Organizing semantic models at scale

For smaller projects, co-locate semantic definitions with your dbt models:

models/
  marts/
    mrt__sales__orders.sql
    mrt__sales__orders.yml      # semantic model + metrics
    mrt__sales__customers.sql
    mrt__sales__customers.yml

For larger projects, separate into dedicated directories:

models/
  marts/
    mrt__sales__orders.sql
    mrt__sales__customers.sql
  semantic/
    semantic_models/
      sem_orders.yml
      sem_customers.yml
    metrics/
      revenue_metrics.yml
      customer_metrics.yml

A complete working example

Putting it all together with an orders model:

-- models/marts/mrt__sales__orders.sql
SELECT
    order_id,
    customer_id,
    created_at AS order__created_at,
    channel AS order__channel,
    status AS order__status,
    amount AS order__amount
FROM {{ ref('int__orders_enriched') }}

Define the semantic layer:

models/semantic/sem_orders.yml
semantic_models:
  - name: orders
    model: ref('mrt__sales__orders')
    defaults:
      agg_time_dimension: order__created_at
    entities:
      - name: order_id
        type: primary
      - name: customer_id
        type: foreign
    dimensions:
      - name: order__created_at
        type: time
        type_params:
          time_granularity: day
      - name: order__status
        type: categorical
      - name: order__channel
        type: categorical
    measures:
      - name: order_total
        agg: sum
        expr: order__amount
      - name: order_count
        agg: count
        expr: order_id
metrics:
  - name: revenue
    type: simple
    label: Revenue
    type_params:
      measure: order_total
  - name: orders
    type: simple
    label: Orders
    type_params:
      measure: order_count
  - name: aov
    type: ratio
    label: AOV
    type_params:
      numerator: revenue
      denominator: orders

Build and query:

dbt build
dbt parse
mf query --metrics revenue,aov --group-by metric_time,order__channel

What comes next

Once your semantic layer is running, you can:

  • Add more semantic models for customers, products, and other entities. MetricFlow will automatically handle joins through shared entities.
  • Define period-over-period metrics using derived metrics with offset_window for comparisons like month-over-month growth (see the sketch after this list).
  • Connect BI tools (dbt Cloud) through JDBC or the GraphQL API.
  • Enable LLM access by exposing your governed metrics to AI tools, so natural-language questions get answered against the same consistent definitions.
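
A hedged sketch of the period-over-period pattern, building on the revenue metric defined earlier (the revenue_mom_growth name and the one-month offset are illustrative choices):

metrics:
  - name: revenue_mom_growth        # hypothetical metric for illustration
    description: Month-over-month revenue growth
    type: derived
    label: Revenue MoM Growth
    type_params:
      expr: (revenue - revenue_prev_month) / revenue_prev_month
      metrics:
        - name: revenue
        - name: revenue
          offset_window: 1 month    # same metric shifted back one month
          alias: revenue_prev_month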

The real value shows up when multiple teams query the same metrics and get the same answers, without maintaining duplicate logic across dashboards and reports.