
Dagster Components

Dagster's newest major abstraction — YAML-configured objects that generate assets, checks, and schedules with minimal Python, lowering the barrier for SQL-first analytics engineers.

Planted
dbt · data engineering · automation

Components are the newest major abstraction in Dagster, reaching GA in the 1.12 cycle (2025). They’re YAML-configured or lightweight Python objects that generate assets, checks, and schedules. The goal is reducing Python boilerplate for common patterns, especially dbt integration.

For analytics engineering teams preferring YAML over Python decorators, Components are the recommended starting path for new Dagster projects in 2025+.

The Problem Components Solve

The traditional Dagster setup for a dbt project requires writing Python:

from pathlib import Path

from dagster_dbt import DbtCliResource, DbtProject, dbt_assets

my_project = DbtProject(project_dir=Path("./transform"))
my_project.prepare_if_dev()

@dbt_assets(manifest=my_project.manifest_path)
def my_dbt_assets(context, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()

This isn’t a lot of code, but it requires understanding Python decorators, generators (yield from), resource injection, and the DbtProject / manifest lifecycle. For an analytics engineer whose primary skill is SQL, each of these concepts is a speed bump.
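To make the generator speed bump concrete, here is a minimal stand-alone illustration of the pattern; stream_events is a hypothetical stand-in for dbt.cli(["build"], context=context).stream(), which yields structured events as dbt runs:

```python
# stream_events() stands in for dbt.cli(...).stream(), which yields
# one event per dbt action as the build progresses.
def stream_events():
    yield "model_a: success"
    yield "model_b: success"

def my_dbt_assets():
    # `yield from` re-yields every event produced by the inner generator,
    # so the caller can consume them one at a time.
    yield from stream_events()

print(list(my_dbt_assets()))  # → ['model_a: success', 'model_b: success']
```

None of this is hard once learned, but it is exactly the kind of Python machinery a SQL-first engineer has to absorb before the dbt integration makes sense.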

Components replace this with a YAML file:

type: dagster_dbt.DbtProjectComponent
params:
  project_dir: ./transform

Same result — every dbt model becomes a tracked Dagster asset with automatic lineage, freshness policies, and quality checks. But the entry point is a configuration file, not a Python module.

Scaffolding with the dg CLI

The dg CLI is the companion tool for Components. It scaffolds the project structure and the defs.yaml that defines your components:

dg scaffold defs dagster_dbt.DbtProjectComponent transform \
  --project-path ./transform

This command creates a defs.yaml in the transform/ directory with the component configuration. From there, the component handles the rest:

  1. Automatic manifest compilation and caching: the component runs prepare_if_dev() internally.
  2. Asset generation from your dbt manifest, identical to what the @dbt_assets decorator produces.

For new projects starting in 2025+, this is the recommended setup path. You get the full dbt-to-Dagster asset mapping with less friction.
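To make "asset generation from the manifest" concrete, here is a hypothetical sketch: dbt's compiled manifest.json lists every node in the project, and the integration derives one asset per model. The manifest dict below is a trimmed, made-up stand-in for a real target/manifest.json:

```python
import json

# Trimmed, hypothetical stand-in for dbt's target/manifest.json.
manifest = json.loads("""
{
  "nodes": {
    "model.transform.stg_orders": {"resource_type": "model"},
    "model.transform.fct_orders": {"resource_type": "model"},
    "seed.transform.countries": {"resource_type": "seed"}
  }
}
""")

# The integration walks the manifest and registers one asset per model;
# here we only extract the model names to show where asset keys come from.
model_names = [
    key.split(".")[-1]
    for key, node in manifest["nodes"].items()
    if node["resource_type"] == "model"
]
print(model_names)  # → ['stg_orders', 'fct_orders']
```

The component does this traversal for you, whether you configure it in YAML or call the decorator yourself.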

DbtProjectComponent in Detail

The DbtProjectComponent is the flagship component and the one most analytics engineers will encounter first. It wraps the entire dbt integration:

type: dagster_dbt.DbtProjectComponent
params:
  project_dir: ./transform
  dbt:
    target: prod
    profiles_dir: ./transform
  asset_attributes:
    group_name: dbt_models
  schedule:
    cron: "0 6 * * *"
    selection: "tag:daily"

This single YAML block:

  • Points at your dbt project directory
  • Configures the DbtCliResource with target and profile settings
  • Sets a default group name for all generated assets
  • Creates a schedule that runs daily-tagged models at 6 AM

The equivalent Python code would span 20-30 lines across multiple files. For teams that want the mapping without heavy customization, the YAML approach is faster to set up and easier to maintain.
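As a point of comparison, a sketch of that equivalent Python wiring might look like the following. This assumes the dagster_dbt APIs named elsewhere in this note (DbtProject, dbt_assets, build_schedule_from_dbt_selection); the job name is illustrative, and it requires an installed dagster_dbt plus a real dbt project at ./transform to actually run:

from pathlib import Path

from dagster import Definitions
from dagster_dbt import (
    DbtCliResource,
    DbtProject,
    build_schedule_from_dbt_selection,
    dbt_assets,
)

my_project = DbtProject(project_dir=Path("./transform"))
my_project.prepare_if_dev()

@dbt_assets(manifest=my_project.manifest_path)
def my_dbt_assets(context, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()

# The group_name from the YAML example would additionally require a
# DagsterDbtTranslator override, omitted here.
daily_schedule = build_schedule_from_dbt_selection(
    [my_dbt_assets],
    job_name="daily_dbt_models",  # illustrative name
    cron_schedule="0 6 * * *",
    dbt_select="tag:daily",
)

defs = Definitions(
    assets=[my_dbt_assets],
    schedules=[daily_schedule],
    resources={"dbt": DbtCliResource(project_dir=my_project, target="prod")},
)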

Beyond dbt

Components aren’t limited to dbt. The architecture is extensible — any common pattern that generates assets from configuration can become a component:

  • Fivetran syncs as components that generate one asset per connector
  • dlt pipelines as components that generate assets from pipeline definitions
  • Python scripts as lightweight components with YAML-defined inputs and outputs

The component model follows the same philosophy as dbt itself: convention over configuration, YAML for the common case, escape to code when you need customization. If your pipeline is standard enough to describe in YAML, Components save you from writing Python. When you need custom logic — a DagsterDbtTranslator override, a complex sensor, a Python processing step — you drop down to the traditional decorator-based approach described in the SDAs note.
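As a toy sketch of that idea (deliberately not Dagster's actual component API), a component is essentially a class that expands declarative params, the kind of thing defs.yaml holds, into definitions at load time:

```python
from dataclasses import dataclass, field

# Toy, hypothetical sketch of the component idea -- NOT Dagster's real API.
@dataclass
class ScriptComponent:
    script_path: str
    outputs: list = field(default_factory=list)

    def build_asset_keys(self):
        # A real component would return Dagster asset definitions; here we
        # just return the keys it would register, one per declared output.
        return [f"{self.script_path}:{name}" for name in self.outputs]

component = ScriptComponent(script_path="scripts/clean.py", outputs=["clean_users"])
print(component.build_asset_keys())  # → ['scripts/clean.py:clean_users']
```

The YAML file supplies the constructor arguments; the framework instantiates the class and collects whatever it builds.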

Components vs. Traditional Setup

Aspect                   | Components (YAML)             | Traditional (Python decorators)
-------------------------|-------------------------------|--------------------------------
Entry barrier            | Low: write YAML               | Higher: write Python
Customization            | Limited to component params   | Full Python flexibility
Manifest handling        | Automatic                     | Manual (prepare_if_dev())
Schedule definition      | YAML params                   | Python build_schedule_from_dbt_selection
Translator customization | Not directly supported        | Full DagsterDbtTranslator override
Best for                 | New projects, SQL-first teams | Existing projects, teams needing custom behavior

The trade-off is flexibility for simplicity. Components are opinionated by design — they make the common case easy and the edge case harder. If you need to filter which models become assets, customize how asset keys are derived from model names, or inject custom metadata from dbt’s meta config, the traditional Python approach gives you full control.

For new projects:

  1. Start with Components. dg scaffold gets a working project running with minimal code.
  2. Add Python customization when Components’ YAML parameters don’t cover the need. The two approaches coexist: YAML-defined dbt components and Python-defined custom assets can share the same Definitions.
  3. Use the traditional setup if deep customization is required from the start (custom translators, complex filtering, multi-project configurations).

This progression keeps the learning curve gradual: Python complexity is introduced incrementally, only as the project's needs outgrow the YAML parameters.