Note

Headless BI Pattern

The architectural pattern of decoupling the semantic layer from visualization — exposing metrics via APIs so any frontend, AI agent, or application can consume governed data

Planted
dbt · analytics · data modeling

Headless BI separates the semantic layer from the visualization layer. Your metric definitions, dimension hierarchies, and access controls live in a standalone service that exposes them via APIs. Any frontend — a custom React dashboard, a Slack bot, a scheduled PDF report, an AI agent, a mobile app — queries the same governed metrics from the same endpoint. No BI tool required in the middle.

The name borrows from headless CMS architecture, where content is managed in a backend and served to any presentation layer via API. Headless BI applies the same principle to analytics: the “content” is your governed metrics, and the “presentation” is whatever needs to display them.

The Problem It Solves

Traditional BI tools bundle three things together:

  1. Semantic modeling — defining what metrics mean, how dimensions relate, what filters apply
  2. Query generation — translating metric requests into warehouse SQL
  3. Visualization — charts, dashboards, interactive exploration

Bundling these creates lock-in. Your metric definitions live inside Looker’s LookML or Tableau’s data model. Switching BI tools means redefining every metric. Serving metrics to a non-BI consumer (an internal tool, a customer-facing product, a Python script) means either duplicating definitions or routing everything through the BI tool’s API, which was designed for dashboards, not arbitrary consumers.

Headless BI unbundles the stack. The semantic layer handles #1 and #2. Visualization (#3) becomes a consumer, not the owner, of metric logic.

Architecture

A headless BI setup typically looks like this:

┌──────────────────────────┐
│      Data Warehouse      │
│  (BigQuery / Snowflake)  │
└────────────┬─────────────┘
             │
┌────────────▼─────────────┐
│      Semantic Layer      │
│  (MetricFlow / Cube.dev) │
│                          │
│  - Metric definitions    │
│  - Dimension hierarchies │
│  - Access controls       │
│  - Query generation      │
└────────────┬─────────────┘
             │ API (REST / GraphQL / JDBC)
   ┌─────────┼─────────┬───────────┬───────────┐
   │         │         │           │           │
   ▼         ▼         ▼           ▼           ▼
BI Tool    React   Slack Bot   AI Agent    Scheduled
(Looker,    App    (@metrics)  (Copilot)     Report
 Preset)                                    (Email)

The semantic layer is the single source of truth. Each consumer sends a query like “give me monthly revenue by region for the last 6 months” and receives structured data back. The consumer handles presentation; the semantic layer handles meaning.
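As a concrete sketch, here is what that "monthly revenue by region for the last 6 months" request looks like as a Cube-style REST query. The member names (`orders.total_revenue`, `orders.region`, `orders.created_at`) are hypothetical; the payload shape follows Cube's `/cubejs-api/v1/load` query format:

```javascript
// A Cube-style query: measures, dimensions, and a time dimension with
// a granularity and date range. Member names are illustrative.
const query = {
  measures: ["orders.total_revenue"],
  dimensions: ["orders.region"],
  timeDimensions: [
    {
      dimension: "orders.created_at",
      granularity: "month",
      dateRange: "last 6 months",
    },
  ],
};

// Every consumer -- dashboard, bot, agent -- sends the same payload to
// the same endpoint, e.g. POST https://cube.example.com/cubejs-api/v1/load
const body = JSON.stringify({ query });
console.log(body);
```

The point is that no consumer writes SQL or redefines "revenue": presentation stays with the consumer, meaning stays with the semantic layer.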

Leading Implementations

Cube.dev

Cube is the most established headless BI platform. It sits between your warehouse and your consumers, providing:

  • Data modeling in JavaScript or YAML that defines metrics, dimensions, joins, and pre-aggregations
  • API layer serving REST, GraphQL, and SQL interfaces
  • Caching and pre-aggregation that dramatically reduce warehouse query load
  • Access control with multi-tenant support

A Cube data model:

cube('Orders', {
  sql_table: 'public.orders',

  measures: {
    count: {
      type: 'count',
    },
    total_revenue: {
      sql: 'amount',
      type: 'sum',
      filters: [{ sql: `${CUBE}.status = 'completed'` }],
    },
    average_order_value: {
      sql: `${total_revenue} / ${count}`,
      type: 'number',
    },
  },

  dimensions: {
    status: {
      sql: 'status',
      type: 'string',
    },
    created_at: {
      sql: 'created_at',
      type: 'time',
    },
  },
});

The pre-aggregation layer is Cube’s differentiator. It materializes commonly requested metric combinations into cache tables, so repeated queries hit pre-computed results rather than running live warehouse queries. For high-traffic embedded analytics (customer-facing dashboards), this can reduce query latency from seconds to milliseconds and cut warehouse costs by orders of magnitude.
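A pre-aggregation is declared alongside the measures and dimensions in the data model. The following fragment is a sketch only (it requires the Cube runtime to execute, and the exact reference syntax varies by Cube version):

```javascript
// Sketch: a pre-aggregation added to the Orders cube above. Cube
// materializes revenue rolled up by status and month, and routes
// matching queries to the cached rollup instead of the warehouse.
cube('Orders', {
  // ...measures and dimensions as defined above...
  pre_aggregations: {
    revenue_by_month: {
      measures: [total_revenue],
      dimensions: [status],
      time_dimension: created_at,
      granularity: 'month',
    },
  },
});
```

Queries that can be answered from the rollup (e.g. monthly revenue filtered by status) never touch the warehouse; anything outside its shape falls through to a live query.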

dbt Semantic Layer API

dbt’s Semantic Layer exposes MetricFlow metrics via three protocols:

  • JDBC — for BI tools and SQL-based consumers
  • GraphQL — for custom applications
  • REST — for lightweight integrations

# GraphQL query to the dbt Semantic Layer
query {
  createQuery(
    metrics: [{ name: "revenue" }]
    groupBy: [{ name: "metric_time", grain: MONTH }]
    where: [{ sql: "{{ Dimension('order_id__region') }} = 'EMEA'" }]
  ) {
    queryId
  }
}

The advantage over Cube is that metric definitions already live in your dbt project — no separate modeling layer to maintain. If your team has invested in MetricFlow YAML, the Semantic Layer API is the natural way to expose those metrics to non-BI consumers.

The disadvantage is that dbt’s API doesn’t include Cube’s pre-aggregation/caching layer, so every query hits the warehouse directly. For low-frequency analytical queries this is fine. For high-frequency embedded analytics serving hundreds of concurrent users, you may need Cube or a caching layer in front.
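In practice the GraphQL document above is wrapped in a JSON POST body and the returned queryId is polled for results. This is a sketch: the endpoint URL and authentication details are placeholders, so verify them against your dbt Cloud account's Semantic Layer settings:

```javascript
// Sketch: building the HTTP request body for the createQuery call shown
// above. The endpoint and token handling are placeholders.
const createQueryDoc = `
  query {
    createQuery(
      metrics: [{ name: "revenue" }]
      groupBy: [{ name: "metric_time", grain: MONTH }]
    ) {
      queryId
    }
  }
`;

const body = JSON.stringify({ query: createQueryDoc });

// POST this body to your dbt Semantic Layer GraphQL endpoint with an
// Authorization: Bearer <service-token> header, then poll the returned
// queryId until the query status reports completion.
console.log(body);
```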

When Headless BI Fits

Headless BI is particularly relevant in three scenarios:

Embedded Analytics

The embedded analytics market is projected at $19.8B (Gartner). When you’re building analytics into a product — a SaaS app showing customers their usage metrics, an internal tool displaying operational KPIs — you need metric definitions available via API, not locked inside a BI tool. The consumer is a React component, not a Looker dashboard.

Key players in embedded analytics include Sigma, Metabase (Modular Embedding SDK), Cube.dev, Looker, and Omni (post-Explo acquisition). A headless architecture gives you flexibility to swap the visualization layer without redefining metrics.

Multi-Consumer Metric Delivery

When the same metric needs to appear in:

  • A BI dashboard for analysts
  • A Slack notification for on-call engineers
  • A customer-facing product page
  • An executive email digest
  • An AI agent answering natural language questions

…maintaining the metric definition in each consumer is unsustainable. The headless pattern centralizes the definition and lets each consumer pull what it needs.

AI-Powered Analytics

AI copilots and natural language query tools work dramatically better when they query a structured API with known metrics and dimensions rather than generating SQL against raw warehouse tables. The semantic layer API gives the AI a constrained vocabulary — “these are the metrics you can request, these are the dimensions you can group by” — which reduces hallucination and improves accuracy. Without this constraint, the AI must guess at table relationships, column semantics, and business rules.
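One way to hand that constrained vocabulary to an LLM is as a function-calling tool whose JSON Schema enumerates the allowed metrics and dimensions. This is a hypothetical sketch: in a real system the `metrics` and `dimensions` lists would be fetched from the semantic layer's metadata API rather than hard-coded:

```javascript
// Hypothetical tool definition for an AI agent. The enum fields are the
// constrained vocabulary: the model can only request governed metrics
// and dimensions, not arbitrary SQL. Names here are illustrative.
const metrics = ["revenue", "order_count", "average_order_value"];
const dimensions = ["metric_time__month", "region", "status"];

const queryMetricsTool = {
  name: "query_metrics",
  description: "Query governed metrics from the semantic layer",
  parameters: {
    type: "object",
    properties: {
      metrics: {
        type: "array",
        items: { type: "string", enum: metrics }, // only governed metrics
      },
      group_by: {
        type: "array",
        items: { type: "string", enum: dimensions }, // only known dimensions
      },
    },
    required: ["metrics"],
  },
};
```

The agent's output is then a structured query against known members, which the semantic layer validates and executes; a hallucinated metric name fails schema validation instead of producing silently wrong SQL.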

When Headless BI Doesn’t Fit

Small teams with a single BI tool. If Metabase or Lightdash serves all your consumers and you don’t have embedded analytics or non-BI consumers, the added complexity of a separate semantic layer API isn’t justified. Your BI tool is your semantic layer, and that’s fine.

Heavy self-service exploration. Headless BI excels at delivering known metrics to known consumers. It’s less suited to ad-hoc exploration where analysts need to drag dimensions, create new calculations, and iterate visually. For that, a traditional BI tool’s interactive interface is still faster. The two approaches complement each other: headless for production metrics, BI tool for exploration.

No stable metric definitions yet. If your organization hasn’t agreed on what “revenue” means, a headless API will faithfully serve the wrong definition everywhere at once. The semantic layer has to be right before you scale its reach.

The Composable Stack

Headless BI fits into a broader trend toward composable data infrastructure. Rather than an all-in-one platform, teams assemble:

Layer            Tool                          Role
Ingestion        Fivetran / dlt                Move data into warehouse
Transformation   dbt                           Clean, model, test
Semantic         MetricFlow / Cube             Define metrics, expose API
Visualization    Lightdash / Preset / custom   Present to users
Orchestration    Airflow / Dagster             Schedule and coordinate
Observability    Elementary / Monte Carlo      Monitor data quality

Each layer is independently swappable. The semantic layer sits between transformation and presentation, and the API is how that interface is exposed. The Fivetran-dbt Labs merger, which folds ingestion and transformation into a single platform, reinforces the semantic API's role as the primary interface for metric consumption. Whether the pattern is worth adopting depends on how many consumers need the metrics and whether layer independence is a practical requirement.