Metrics-as-code is the practice of defining business metrics — revenue, conversion rate, active users, churn — in version-controlled configuration files (typically YAML) that live alongside your data transformation code. Metric definitions go through pull requests, code review, automated testing, and CI/CD pipelines, exactly like the SQL models they depend on.
This is the analytics equivalent of infrastructure-as-code. Instead of metrics being defined inside BI tools (where they’re invisible to engineering workflows, difficult to audit, and impossible to test systematically), they become first-class citizens of the codebase.
Why This Pattern Exists
The traditional approach to business metrics is distributed definition: each BI dashboard, each SQL query, each Python notebook contains its own calculation of “revenue” or “conversion rate.” These definitions diverge silently — Finance may calculate revenue including pending orders, Marketing may exclude them, and the executive dashboard may use a third definition. Metrics-as-code centralizes definitions in a single, auditable location and pushes them to all downstream consumers.
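As an illustration of how this drift happens, consider two ad-hoc queries that both claim to compute revenue (the `orders` table and status values here are hypothetical):

```sql
-- Finance's notebook: counts pending orders as revenue.
select sum(amount) as revenue
from orders
where status in ('completed', 'pending');

-- Marketing's dashboard: completed orders only.
select sum(amount) as revenue
from orders
where status = 'completed';
```

Both queries are individually defensible; the problem is that nothing forces them to agree, and nothing flags it when they stop agreeing.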
What It Looks Like in Practice
Metric Definitions in dbt MetricFlow
The most mature implementation is dbt’s MetricFlow, where metrics are YAML definitions in your dbt project:
```yaml
metrics:
  - name: monthly_recurring_revenue
    description: >
      Total recurring revenue for the month, calculated from active
      subscriptions. Excludes one-time charges and usage-based fees.
    type: simple
    type_params:
      measure: mrr_amount
    filter: |
      {{ Dimension('subscription_id__status') }} = 'active'
    meta:
      owner: finance-analytics
      tier: certified
      last_reviewed: 2026-03-15
```

The `description` field serves double duty: documentation for humans and context for AI copilots that need to understand what the metric represents. The `meta` block captures governance information — who owns the definition, whether it’s been certified, when it was last reviewed.
Derived and Ratio Metrics
Simple metrics aggregate a single measure. Derived metrics compose other metrics:
```yaml
metrics:
  - name: average_revenue_per_user
    description: "ARPU: monthly recurring revenue divided by active user count"
    type: derived
    type_params:
      expr: monthly_recurring_revenue / active_users
      metrics:
        - name: monthly_recurring_revenue
        - name: active_users

  - name: gross_margin
    description: "Revenue minus COGS, divided by revenue"
    type: derived
    type_params:
      expr: (revenue - cost_of_goods_sold) / revenue
      metrics:
        - name: revenue
        - name: cost_of_goods_sold
```

This composability is critical. When the definition of revenue changes (say, you start excluding a new category of refunds), average_revenue_per_user and gross_margin automatically update. No dashboard hunting. No “did we update all the places that reference revenue?”
Cumulative and Period-Over-Period Metrics
MetricFlow supports time-windowed aggregations natively:
```yaml
metrics:
  - name: trailing_12m_revenue
    description: "Rolling 12-month revenue window"
    type: cumulative
    type_params:
      measure: revenue_amount
      window: 12 months

  - name: revenue_growth_rate
    description: "Month-over-month revenue growth as a percentage"
    type: derived
    type_params:
      expr: (current_revenue - previous_revenue) / previous_revenue
      metrics:
        - name: revenue
          alias: current_revenue
        - name: revenue
          offset_window: 1 month
          alias: previous_revenue
```

These definitions would otherwise be complex SQL with window functions and date arithmetic. Expressed as YAML, they’re readable by non-technical stakeholders who can review whether the business logic is correct even if they can’t write SQL.
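For a sense of what the YAML is abstracting away, here is a rough sketch of the hand-written SQL a trailing 12-month window typically requires; `monthly_revenue`, `month_start`, and `revenue_amount` are hypothetical names, not objects defined elsewhere in this section:

```sql
-- Hand-rolled equivalent of trailing_12m_revenue (sketch).
-- Assumes monthly_revenue has exactly one row per month;
-- missing months would require a calendar spine join.
select
    month_start,
    sum(revenue_amount) over (
        order by month_start
        rows between 11 preceding and current row
    ) as trailing_12m_revenue
from monthly_revenue
order by month_start;
```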
The Workflow
Metrics-as-code only delivers value if the workflow enforces the practice. The lifecycle looks like this:
1. Define. An analyst or analytics engineer writes the metric YAML in a feature branch, including description, owner, and any filters.
2. Review. A pull request shows exactly what changed. Reviewers can see the metric formula, the underlying measure, and the filters applied. The diff is meaningful:
```diff
 metrics:
   - name: revenue
     type: simple
     type_params:
       measure: order_total
     filter: |
-      {{ Dimension('order_id__status') }} = 'completed'
+      {{ Dimension('order_id__status') }} IN ('completed', 'partially_refunded')
```

This is a material change to a core business metric. In a BI-tool-defined metric, this change would be invisible to code review. In YAML, it’s a two-line diff that triggers the right conversation.
3. Test. CI checks validate the metric:
```yaml
# In your dbt project's tests
unit_tests:
  - name: test_revenue_excludes_cancelled
    model: semantic_model_orders
    given:
      - input: ref('base__shopify__orders')
        rows:
          - {order_id: 1, amount: 100, status: "completed"}
          - {order_id: 2, amount: 50, status: "cancelled"}
    expect:
      rows:
        - {order_total: 100}
```

Automated tests catch regressions: did this change break the expected output? Does the metric still match the finance team’s validated numbers for the reference period?
4. Deploy. Merge to main. The semantic layer picks up the new definition. Every downstream consumer — BI dashboards, AI copilots, scheduled reports, embedded analytics — automatically uses the updated metric without any manual changes, as the query sketch after this list illustrates.
5. Discover. dbt generates documentation from the YAML, making metrics searchable. Teams can browse available metrics, read descriptions, understand filters, and see which measures they’re built on.
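To make the deploy step concrete, here is a sketch of how a downstream consumer can request a metric by name through the dbt Cloud Semantic Layer’s JDBC interface; the exact `semantic_layer.query` syntax can vary by dbt version, so treat this as illustrative rather than definitive:

```sql
-- A consumer-side query against the dbt Cloud Semantic Layer (JDBC).
-- The consumer names the metric; the semantic layer compiles the
-- governed definition to SQL at query time.
select *
from {{
    semantic_layer.query(
        metrics=['monthly_recurring_revenue'],
        group_by=['metric_time']
    )
}}
```

Because the formula lives server-side, a dashboard or copilot issuing this query picks up a merged definition change on its next run, with no edits on the consumer side.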
Which Tools Support It
The metrics-as-code pattern has moved from niche to expected across the modern BI landscape:
Full native support:
- Lightdash reads metric definitions directly from dbt project YAML. No separate modeling layer. What you define in dbt is exactly what appears in the BI tool.
- Evidence uses SQL + Markdown for reports, with metrics defined in code and deployed as static sites. Reports are literally files in a Git repository.
Semantic layer consumers:
- Looker has a dbt Semantic Layer connector, though LookML remains its primary modeling language.
- Steep integrates with dbt Cloud’s semantic layer API.
- Any tool connecting to the dbt Cloud Semantic Layer API (JDBC/GraphQL/REST) consumes metrics-as-code definitions.
Moving in this direction:
- Power BI adopted PBIR (Enhanced Report Format) as the default in January 2026. PBIR is folder-based and Git-friendly, bringing report definitions closer to code workflows, though metric definitions still live inside the Power BI model rather than in external YAML.
- Tableau shipped a dbt Semantic Layer connector in Tableau Cloud 2025.2.
The Governance Gap Without Metrics-as-Code
Without centralized, version-controlled metric definitions, governance degrades predictably:
| Problem | What Happens |
|---|---|
| Metric drift | 5 dashboards show 5 different revenue numbers |
| Shadow metrics | Analysts create ad-hoc calculations that become de facto standards |
| Audit failure | No history of when or why a metric definition changed |
| AI hallucination | Copilots invent metric calculations with no governed vocabulary |
| Onboarding friction | New team members spend weeks discovering which metrics exist |
The 73% BI implementation failure rate reported by SR Analytics is attributed to this governance gap rather than to tool selection: metric definitions go unstandardized, and dashboard trust erodes.
Adoption Patterns
Starting with 5–10 core business metrics that cause the most confusion or inconsistency is more practical than modeling everything at once.
The metric definition in code must be authoritative. A conflicting metric defined in a BI tool’s UI is a consistency problem to fix. `meta` fields for owner, tier (certified/exploratory/deprecated), and last-reviewed date support discoverability and accountability.
Metric values should be tested against known-good reference data: if the finance team has validated Q4 revenue, the metric definition should reproduce that number exactly.
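One lightweight way to enforce this is a dbt singular test that fails whenever the governed metric stops reproducing the validated figure. This is a sketch: the model name, date column, and reference value are placeholders, not values from this article:

```sql
-- tests/assert_q4_revenue_matches_finance.sql
-- dbt singular test: returning any rows fails the test.
with computed as (
    select sum(order_total) as revenue
    from {{ ref('fct_orders') }}  -- placeholder model name
    where ordered_at >= '2025-10-01' and ordered_at < '2026-01-01'
)
select *
from computed
-- 4815162.00 stands in for the finance-validated Q4 number
where abs(revenue - 4815162.00) > 0.01
```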
A modification to a core metric like revenue is a breaking change for every dashboard that displays it. Versioning and change communication apply to metric changes the same way they apply to schema changes.