Metrics-as-code is the practice of defining business metrics — revenue, conversion rate, active users, churn — in version-controlled configuration files (typically YAML) that live alongside your data transformation code. Metric definitions go through pull requests, code review, automated testing, and CI/CD pipelines, exactly like the SQL models they depend on.
This is the analytics equivalent of infrastructure-as-code. Instead of metrics being defined inside BI tools (where they’re invisible to engineering workflows, difficult to audit, and impossible to test systematically), they become first-class citizens of the codebase.
Why This Pattern Exists
The traditional approach to business metrics is distributed definition: each BI dashboard, each SQL query, each Python notebook contains its own calculation of “revenue” or “conversion rate.” These definitions diverge silently — Finance may calculate revenue including pending orders, Marketing may exclude them, and the executive dashboard may use a third definition. Metrics-as-code centralizes definitions in a single, auditable location and pushes them to all downstream consumers.
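As an illustration of how this drift happens, consider two ad-hoc queries that both claim to compute revenue (the `orders` table and status values here are hypothetical):

```sql
-- Finance's notebook: counts pending orders as revenue.
select sum(amount) as revenue
from orders
where status in ('completed', 'pending');

-- Marketing's dashboard: completed orders only.
select sum(amount) as revenue
from orders
where status = 'completed';
```

Both queries are individually defensible; the problem is that nothing forces them to agree, and nothing flags it when they stop agreeing.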
What It Looks Like in Practice
Metric Definitions in dbt MetricFlow
The most mature implementation is dbt’s MetricFlow, where metrics are YAML definitions in your dbt project:
```yaml
metrics:
  - name: monthly_recurring_revenue
    description: >
      Total recurring revenue for the month, calculated from active
      subscriptions. Excludes one-time charges and usage-based fees.
    type: simple
    type_params:
      measure: mrr_amount
    filter: |
      {{ Dimension('subscription_id__status') }} = 'active'
    meta:
      owner: finance-analytics
      tier: certified
      last_reviewed: 2026-03-15
```

The `description` field serves double duty: documentation for humans and context for AI copilots that need to understand what the metric represents. The `meta` block captures governance information — who owns the definition, whether it’s been certified, when it was last reviewed.
Derived and Ratio Metrics
Simple metrics aggregate a single measure. Derived metrics compose other metrics:
```yaml
metrics:
  - name: average_revenue_per_user
    description: "ARPU: monthly recurring revenue divided by active user count"
    type: derived
    type_params:
      expr: monthly_recurring_revenue / active_users
      metrics:
        - name: monthly_recurring_revenue
        - name: active_users

  - name: gross_margin
    description: "Revenue minus COGS, divided by revenue"
    type: derived
    type_params:
      expr: (revenue - cost_of_goods_sold) / revenue
      metrics:
        - name: revenue
        - name: cost_of_goods_sold
```

This composability is critical. When the definition of revenue changes (say, you start excluding a new category of refunds), average_revenue_per_user and gross_margin automatically update. No dashboard hunting. No “did we update all the places that reference revenue?”
Cumulative and Period-Over-Period Metrics
MetricFlow supports time-windowed aggregations natively:
```yaml
metrics:
  - name: trailing_12m_revenue
    description: "Rolling 12-month revenue window"
    type: cumulative
    type_params:
      measure: revenue_amount
      window: 12 months

  - name: revenue_growth_rate
    description: "Month-over-month revenue growth as a percentage"
    type: derived
    type_params:
      expr: (current_revenue - previous_revenue) / previous_revenue
      metrics:
        - name: revenue
          alias: current_revenue
        - name: revenue
          offset_window: 1 month
          alias: previous_revenue
```

These definitions would otherwise be complex SQL with window functions and date arithmetic. Expressed as YAML, they’re readable by non-technical stakeholders who can review whether the business logic is correct even if they can’t write SQL.
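For a sense of what the YAML is abstracting away, here is a rough sketch of the hand-written SQL a trailing 12-month window typically requires; `monthly_revenue`, `month_start`, and `revenue_amount` are hypothetical names, not objects defined elsewhere in this section:

```sql
-- Hand-rolled equivalent of trailing_12m_revenue (sketch).
-- Assumes monthly_revenue has exactly one row per month;
-- missing months would require a calendar spine join.
select
    month_start,
    sum(revenue_amount) over (
        order by month_start
        rows between 11 preceding and current row
    ) as trailing_12m_revenue
from monthly_revenue
order by month_start;
```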
The Workflow
Metrics-as-code only delivers value if the workflow enforces the practice. The lifecycle looks like this:
1. Define. An analyst or analytics engineer writes the metric YAML in a feature branch, including description, owner, and any filters.
2. Review. A pull request shows exactly what changed. Reviewers can see the metric formula, the underlying measure, and the filters applied. The diff is meaningful:
```diff
 metrics:
   - name: revenue
     type: simple
     type_params:
       measure: order_total
     filter: |
-      {{ Dimension('order_id__status') }} = 'completed'
+      {{ Dimension('order_id__status') }} IN ('completed', 'partially_refunded')
```

This is a material change to a core business metric. In a BI-tool-defined metric, this change would be invisible to code review. In YAML, it’s a two-line diff that triggers the right conversation.
3. Test. CI checks validate the metric:
```yaml
# In your dbt project's tests
unit_tests:
  - name: test_revenue_excludes_cancelled
    model: semantic_model_orders
    given:
      - input: ref('base__shopify__orders')
        rows:
          - {order_id: 1, amount: 100, status: "completed"}
          - {order_id: 2, amount: 50, status: "cancelled"}
    expect:
      rows:
        - {order_total: 100}
```

Automated tests catch regressions: did this change break the expected output? Does the metric still match the finance team’s validated numbers for the reference period?
4. Deploy. Merge to main. The semantic layer picks up the new definition. Every downstream consumer — BI dashboards, AI copilots, scheduled reports, embedded analytics — automatically uses the updated metric without any manual changes, as the query sketch after this list illustrates.
5. Discover. dbt generates documentation from the YAML, making metrics searchable. Teams can browse available metrics, read descriptions, understand filters, and see which measures they’re built on.
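To make the deploy step concrete, here is a sketch of how a downstream consumer can request a metric by name through the dbt Cloud Semantic Layer’s JDBC interface; the exact `semantic_layer.query` syntax can vary by dbt version, so treat this as illustrative rather than definitive:

```sql
-- A consumer-side query against the dbt Cloud Semantic Layer (JDBC).
-- The consumer names the metric; the semantic layer compiles the
-- governed definition to SQL at query time.
select *
from {{
    semantic_layer.query(
        metrics=['monthly_recurring_revenue'],
        group_by=['metric_time']
    )
}}
```

Because the formula lives server-side, a dashboard or copilot issuing this query picks up a merged definition change on its next run, with no edits on the consumer side.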
Which Tools Support It
The metrics-as-code pattern has moved from niche to expected across the modern BI landscape:
Full native support:
- Lightdash reads metric definitions directly from dbt project YAML. No separate modeling layer. What you define in dbt is exactly what appears in the BI tool.
- Evidence uses SQL + Markdown for reports, with metrics defined in code and deployed as static sites. Reports are literally files in a Git repository.
Semantic layer consumers:
- Looker has a dbt Semantic Layer connector, though LookML remains its primary modeling language.
- Steep integrates with dbt Cloud’s semantic layer API.
- Any tool connecting to the dbt Cloud Semantic Layer API (JDBC/GraphQL/REST) consumes metrics-as-code definitions.
Moving in this direction:
- Power BI adopted PBIR (Enhanced Report Format) as the default in January 2026. PBIR is folder-based and Git-friendly, bringing report definitions closer to code workflows, though metric definitions still live inside the Power BI model rather than in external YAML.
- Tableau shipped a dbt Semantic Layer connector in Tableau Cloud 2025.2.
The Governance Gap Without Metrics-as-Code
Without centralized, version-controlled metric definitions, governance degrades predictably:
| Problem | What Happens |
|---|---|
| Metric drift | 5 dashboards show 5 different revenue numbers |
| Shadow metrics | Analysts create ad-hoc calculations that become de facto standards |
| Audit failure | No history of when or why a metric definition changed |
| AI hallucination | Copilots invent metric calculations with no governed vocabulary |
| Onboarding friction | New team members spend weeks discovering which metrics exist |
The 73% BI implementation failure rate reported by SR Analytics is attributed to this governance gap rather than to tool selection: metric definitions go unstandardized, and dashboard trust erodes.
Adoption Patterns
Starting with 5–10 core business metrics that cause the most confusion or inconsistency is more practical than modeling everything at once.
The metric definition in code must be authoritative. A conflicting metric defined in a BI tool’s UI is a consistency problem to fix. `meta` fields for owner, tier (certified/exploratory/deprecated), and last-reviewed date support discoverability and accountability.
Metric values should be tested against known-good reference data: if the finance team has validated Q4 revenue, the metric definition should reproduce that number exactly.
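One lightweight way to enforce this is a dbt singular test that fails whenever the governed metric stops reproducing the validated figure. This is a sketch: the model name, date column, and reference value are placeholders, not values from this article:

```sql
-- tests/assert_q4_revenue_matches_finance.sql
-- dbt singular test: returning any rows fails the test.
with computed as (
    select sum(order_total) as revenue
    from {{ ref('fct_orders') }}  -- placeholder model name
    where ordered_at >= '2025-10-01' and ordered_at < '2026-01-01'
)
select *
from computed
-- 4815162.00 stands in for the finance-validated Q4 number
where abs(revenue - 4815162.00) > 0.01
```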
A modification to a core metric like revenue is a breaking change for every dashboard that displays it. Versioning and change communication apply to metric changes the same way they apply to schema changes.