Dagster-dbt Asset Mapping

The dagster-dbt integration reads a dbt project’s manifest.json and creates one Dagster asset per model, seed, and snapshot. Dependencies from ref() and source() calls in SQL become edges in the Dagster asset graph. The compute function runs dbt build under the hood.
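
To make the mapping concrete, here is a minimal sketch of how dependency edges could be derived from a manifest. The fragment below is hand-written to resemble the shape of manifest.json (the project name shop and the node names are hypothetical), and asset_edges is an illustrative helper, not the integration's actual implementation.

```python
# Illustrative only: a tiny, hand-written fragment shaped like dbt's manifest.json.
manifest = {
    "nodes": {
        "model.shop.int__google_ads__clicks": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.google_ads.clicks"]},
        },
        "model.shop.mrt__marketing__campaign_performance": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.shop.int__google_ads__clicks"]},
        },
        "test.shop.not_null_clicks_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.int__google_ads__clicks"]},
        },
    }
}

# Node types that the article says become Dagster assets.
ASSET_TYPES = {"model", "seed", "snapshot"}


def asset_edges(manifest: dict) -> list[tuple[str, str]]:
    """Return (upstream, downstream) pairs for nodes that become assets."""
    edges = []
    for unique_id, node in manifest["nodes"].items():
        if node["resource_type"] not in ASSET_TYPES:
            continue  # tests and other node types are skipped here
        for upstream in node["depends_on"]["nodes"]:
            edges.append((upstream, unique_id))
    return edges


edges = asset_edges(manifest)
```

Each ref() or source() call in a model's SQL shows up as one entry in that node's depends_on list, which is why no extra dependency declarations are needed on the Dagster side.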

The Basic Setup

The core setup requires three components: a DbtProject pointing at your project directory, a manifest for Dagster to read, and a @dbt_assets-decorated function.

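from pathlib import Path

from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, DbtProject, dbt_assets

# Point Dagster at the dbt project directory
my_project = DbtProject(project_dir=Path("./transform"))
# Regenerate manifest.json during local development
my_project.prepare_if_dev()

@dbt_assets(manifest=my_project.manifest_path)
def my_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Run dbt build and stream dbt's events back to Dagster
    yield from dbt.cli(["build"], context=context).stream()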

Three things happen here. DbtProject points Dagster at your dbt project directory. prepare_if_dev() runs dbt parse to generate manifest.json during local development — in production, you build the manifest at deploy time with dagster-dbt project prepare-and-package. The @dbt_assets decorator reads that manifest and creates one asset per node.

When you run dagster dev and open localhost:3000, every dbt model appears in the asset catalog with its upstream and downstream dependencies visualized. If mrt__marketing__campaign_performance depends on int__google_ads__clicks and int__meta_ads__impressions, the graph shows those relationships automatically, pulled straight from your SQL ref() calls.

Customizing the Mapping with DagsterDbtTranslator

The default mapping works for most projects, but DagsterDbtTranslator lets you control how dbt nodes become Dagster assets. You subclass it and override specific methods.

Custom Asset Keys and Groups

By default, Dagster uses the dbt model name as the asset key. If your project follows a layered naming convention with prefixes like base__, int__, and mrt__, the defaults work well. If you need to map models into specific Dagster groups or adjust key prefixes, override get_asset_key and get_group_name:

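from pathlib import Path

from dagster import AssetKey
from dagster_dbt import DagsterDbtTranslator, DbtCliResource, dbt_assets

class CustomTranslator(DagsterDbtTranslator):
    def get_asset_key(self, dbt_resource_props):
        # "path" is relative to the models directory, e.g. marts/marketing/model.sql
        node_path = Path(dbt_resource_props["path"])
        folder = node_path.parent.name
        # Keep the default key for non-models and for models with no parent folder
        if dbt_resource_props["resource_type"] != "model" or not folder:
            return super().get_asset_key(dbt_resource_props)
        # marts/marketing/model.sql -> ["marketing", "model"]
        return AssetKey([folder, node_path.stem])

    def get_group_name(self, dbt_resource_props):
        # Use the immediate parent folder as the Dagster group, e.g. "marketing";
        # fall back to the default behavior for files at the top level
        folder = Path(dbt_resource_props["path"]).parent.name
        return folder or super().get_group_name(dbt_resource_props)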

@dbt_assets(
    manifest=my_project.manifest_path,
    dagster_dbt_translator=CustomTranslator(),
)
def my_dbt_assets(context, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()

The get_group_name override is particularly useful for teams using folder-based organization in dbt. If your models live in marts/marketing/ and marts/finance/, Dagster groups them accordingly in the UI, making it easy to filter the asset graph by business domain.
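
To get domain-level groups such as marketing and finance out of paths like marts/marketing/, the group has to come from the model's immediate parent folder rather than the top-level one. The sketch below isolates that path logic so it can be checked without Dagster installed. It assumes paths are relative to the models directory, and key_and_group and the "default" fallback are hypothetical names chosen for illustration.

```python
from pathlib import PurePosixPath


def key_and_group(model_path: str) -> tuple[list[str], str]:
    """Derive asset-key components and a group name from a model's path."""
    p = PurePosixPath(model_path)
    # Immediate parent folder; top-level files fall back to a placeholder group.
    folder = p.parent.name or "default"
    return [folder, p.stem], folder


key, group = key_and_group("marts/marketing/mrt__marketing__campaign_performance.sql")
# key is ["marketing", "mrt__marketing__campaign_performance"], group is "marketing"
```

Using the first path component instead would put every model under marts into a single group, which is rarely what a team organizing by business domain wants.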

Tags, Owners, and Metadata from dbt Meta

Dagster reads dbt’s meta configuration. Properties you set in your schema.yml carry through to the Dagster UI:

models:
  - name: mrt__finance__monthly_revenue
    meta:
      dagster:
        owners: ["team:finance", "adrienne@example.com"]
        group: finance
    tags:
      - daily
      - critical
    columns:
      - name: revenue__total_usd
        description: "Total revenue in USD for the month"

Tags from dbt map to Dagster tags. Owners specified in meta.dagster.owners appear in the asset catalog and can be used to filter the lineage graph by team. This means your existing dbt project documentation doubles as Dagster metadata — no need to maintain two sources of truth for ownership and classification.
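
As a rough illustration of what gets carried over, the sketch below pulls the Dagster-relevant settings out of a dictionary shaped like a model's entry in the manifest. The helper dagster_settings is hypothetical and is not how dagster-dbt implements this internally. Representing each dbt tag as a Dagster tag key with an empty value follows my understanding of the integration's documented behavior, so treat that detail as an assumption.

```python
def dagster_settings(dbt_resource_props: dict) -> dict:
    """Collect owners, group, and tags from a manifest-style model entry."""
    dagster_meta = dbt_resource_props.get("meta", {}).get("dagster", {})
    return {
        "owners": dagster_meta.get("owners", []),
        "group": dagster_meta.get("group"),
        # dbt tags are plain strings; Dagster tags are key-value pairs
        # (assumed here: the tag becomes the key, with an empty value).
        "tags": {tag: "" for tag in dbt_resource_props.get("tags", [])},
    }


# Mirrors the schema.yml example for mrt__finance__monthly_revenue.
props = {
    "name": "mrt__finance__monthly_revenue",
    "meta": {
        "dagster": {
            "owners": ["team:finance", "adrienne@example.com"],
            "group": "finance",
        }
    },
    "tags": ["daily", "critical"],
}

settings = dagster_settings(props)
```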

Filtering Which Models Become Assets

Not every dbt node needs to be a Dagster asset. Ephemeral models don’t produce tables, so they have no materialization to track. To control which nodes become assets, use dbt selection syntax in the @dbt_assets decorator’s select and exclude parameters:

@dbt_assets(
    manifest=my_project.manifest_path,
    select="tag:dagster",  # Only models tagged with 'dagster' in dbt
)
def my_dbt_assets(context, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()

Tag critical models in dbt, and only those become tracked assets in Dagster. For large dbt projects with hundreds of models, filtering reduces noise in the Dagster UI and focuses monitoring on the models that matter most.
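
For intuition, a selector like tag:dagster can be thought of as a filter over the manifest's nodes. The sketch below is a deliberately simplified emulation of that one selector. Real dbt selection also supports paths, graph operators, and set operations, and the node names here are hypothetical.

```python
def select_by_tag(nodes: dict, tag: str) -> list[str]:
    """Return the unique IDs of nodes carrying the given dbt tag."""
    return sorted(uid for uid, node in nodes.items() if tag in node.get("tags", []))


nodes = {
    "model.shop.mrt__finance__monthly_revenue": {"tags": ["dagster", "daily"]},
    "model.shop.int__scratch__experiment": {"tags": []},
    "model.shop.mrt__marketing__campaign_performance": {"tags": ["dagster"]},
}

selected = select_by_tag(nodes, "dagster")
```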

Project Setup: Two Paths

The Components Approach (Recommended for New Projects)

Since the 1.12 cycle, Dagster recommends the DbtProjectComponent for new dbt integrations. The dg CLI scaffolds everything:

dg scaffold defs dagster_dbt.DbtProjectComponent transform \
  --project-path ./transform

This creates a defs.yaml that handles manifest compilation and caching automatically. The component generates all assets from your dbt project with minimal Python. For teams that want the mapping without heavy customization, this is the fastest path.

The Traditional Approach

For existing projects or teams that want more control, the traditional scaffold generates Python files directly:

dagster-dbt project scaffold \
  --project-name my_project \
  --dbt-project-dir ./transform

This produces a Python package with the @dbt_assets decorator, a DbtProject definition, and resource configuration. You edit the Python files directly to customize behavior — the DagsterDbtTranslator overrides, the select filters, the schedule definitions all live in code you own.

Manifest Handling

The manifest is the bridge between dbt and Dagster. During local development, prepare_if_dev() or the Components framework handles manifest generation transparently. For production deploys, you build the manifest as part of your CI/CD pipeline:

# In CI/CD pipeline
dbt deps --project-dir ./transform
dagster-dbt project prepare-and-package \
  --file ./my_project/project.py

This compiles the manifest once at deploy time, so Dagster doesn’t need to run dbt parse in production. The separation matters: local development stays fast (parse on the fly), production stays deterministic (pre-built manifest).
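
Because production relies on a pre-built manifest, a deploy can fail in confusing ways if that file is missing or empty. One option is a small guard in CI, sketched below. It assumes the manifest sits at target/manifest.json inside the dbt project directory (dbt's default target path), and load_manifest is a hypothetical helper rather than part of dagster-dbt.

```python
import json
from pathlib import Path


def load_manifest(project_dir: Path) -> dict:
    """Load a pre-built manifest, failing loudly if it is missing or empty."""
    manifest_path = project_dir / "target" / "manifest.json"
    if not manifest_path.is_file():
        raise FileNotFoundError(
            f"{manifest_path} not found; build the manifest before deploying"
        )
    manifest = json.loads(manifest_path.read_text())
    if not manifest.get("nodes"):
        raise ValueError(f"{manifest_path} contains no nodes")
    return manifest
```

Calling load_manifest(Path("./transform")) right after the prepare-and-package step surfaces a missing manifest at build time rather than when the code location loads.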

What the Mapping Gives You

The mapping isn’t just a visual convenience. Once your dbt models exist as Dagster assets, you gain:

Cross-system lineage. dbt models appear alongside Python assets, Fivetran syncs, and any other Dagster-managed data in one graph. Your dbt transformation layer isn’t isolated — it’s connected to everything upstream and downstream.
Unified execution. dbt build runs through Dagster’s orchestration, which means scheduling, retry policies, and freshness tracking all apply.
Asset-level monitoring. Each model has its own materialization history, duration tracking, and quality checks. You can see when mrt__finance__monthly_revenue last ran, how long it took, and whether its tests passed.
Selective materialization. In the Dagster UI, you can select specific assets and materialize just those, plus their downstream dependencies. This is useful for ad-hoc fixes or targeted backfills that should not run the entire dbt project.
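
Selecting an asset plus its downstream dependencies amounts to a graph traversal over the lineage edges. The sketch below shows one way to compute that set with a breadth-first search. The function with_downstream and the edge list are illustrative, not Dagster's selection engine.

```python
from collections import deque


def with_downstream(selected: set[str], edges: list[tuple[str, str]]) -> set[str]:
    """Expand a selection to include every asset downstream of it."""
    children: dict[str, list[str]] = {}
    for upstream, downstream in edges:
        children.setdefault(upstream, []).append(downstream)

    result = set(selected)
    queue = deque(selected)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in result:
                result.add(child)
                queue.append(child)
    return result


# Hypothetical lineage: (upstream, downstream) pairs.
lineage = [
    ("base__google_ads__clicks", "int__google_ads__clicks"),
    ("int__google_ads__clicks", "mrt__marketing__campaign_performance"),
    ("int__meta_ads__impressions", "mrt__marketing__campaign_performance"),
]

to_run = with_downstream({"int__google_ads__clicks"}, lineage)
```

Upstream models such as base__google_ads__clicks are left out of the run, which is what keeps a targeted fix cheaper than a full project build.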

The dbt project itself does not change when you add Dagster. Dagster wraps it and adds orchestration capabilities on top.