Dataform-to-dbt Concept Mapping

A reference mapping of Dataform concepts to their dbt equivalents — refs, configs, sources, materializations, testing, and directory structure.

Tags: dbt, dataform, data engineering, data modeling

A reference translation between Dataform and dbt concepts. Most concepts have direct equivalents. The syntax changes; the intent does not.

Direct Syntax Equivalents

| Dataform | dbt | Notes |
| --- | --- | --- |
| ${ref("model")} | {{ ref('model') }} | Same DAG-building reference |
| config { type: "table" } | {{ config(materialized='table') }} | Materialization declaration |
| config { type: "view" } | {{ config(materialized='view') }} | Default in both tools |
| config { type: "incremental" } | {{ config(materialized='incremental') }} | See Incremental Models in dbt |
| ${self()} | {{ this }} | Reference to the current model’s table |
| ${when(incremental(), ...)} | {% if is_incremental() %} ... {% endif %} | Conditional incremental logic |
| .sqlx files | .sql files | File extension |
| definitions/ | models/ | Model directory |
| includes/ | macros/ | Reusable code directory |
| dataform.json | dbt_project.yml | Project configuration |
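
The first rows of the table are mechanical enough to sketch as a text rewrite. The snippet below is a minimal illustration, not a migration tool: `translateSqlx` is a hypothetical helper name, and it handles only the single-argument `ref` form and `${self()}`, assuming the expressions appear exactly as in the table.

```javascript
// Hypothetical helper, not part of Dataform or dbt: rewrites the simplest
// expressions from the table above into their dbt Jinja equivalents.
function translateSqlx(sql) {
  return sql
    // ${ref("model")} -> {{ ref('model') }} (single-argument form only)
    .replace(
      /\$\{\s*ref\(\s*["']([^"']+)["']\s*\)\s*\}/g,
      (_match, model) => `{{ ref('${model}') }}`
    )
    // ${self()} -> {{ this }}
    .replace(/\$\{\s*self\(\)\s*\}/g, "{{ this }}");
}

// Single quotes keep JavaScript from interpolating the ${...} markers.
const sqlx = 'SELECT * FROM ${ref("orders")} WHERE id NOT IN (SELECT id FROM ${self()})';
console.log(translateSqlx(sqlx));
```

Config blocks, incremental conditions, and JavaScript includes need more than a regular expression and are better translated by hand.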

Source Declarations

This is where the tools start to diverge. Dataform uses JavaScript declaration files. dbt uses YAML source definitions that bundle documentation and freshness checks into a single location.

Dataform:

definitions/sources/ga4.js
declare({
  database: "my-project",
  schema: "analytics_123456789",
  name: "events_*"
});

dbt:

models/base/_sources.yml
sources:
  - name: ga4
    database: my-project
    schema: analytics_123456789
    tables:
      - name: events
        identifier: "events_*"
        freshness:
          warn_after: {count: 24, period: hour}
        description: "Raw GA4 event export"

The dbt version adds freshness monitoring and documentation in the same file. In Dataform, freshness checks require separate implementation. The {{ source('ga4', 'events') }} function in dbt replaces ${ref("analytics_123456789", "events_*")} in Dataform.
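
The two-argument `ref` to `source()` swap can be sketched the same way. This is an illustration under assumptions: `SOURCE_MAP` and `translateSourceRefs` are hypothetical names, and the lookup table has to be filled in by hand to match whatever the project's `_sources.yml` declares.

```javascript
// Hypothetical lookup from Dataform "schema.name" to the dbt (source, table)
// pair declared in _sources.yml. Entries are maintained by hand.
const SOURCE_MAP = {
  "analytics_123456789.events_*": ["ga4", "events"],
};

function translateSourceRefs(sql, sourceMap = SOURCE_MAP) {
  // Matches ${ref("schema", "name")} with either quote style.
  const twoArgRef =
    /\$\{\s*ref\(\s*["']([^"']+)["']\s*,\s*["']([^"']+)["']\s*\)\s*\}/g;
  return sql.replace(twoArgRef, (match, schema, name) => {
    const target = sourceMap[`${schema}.${name}`];
    // Unknown references are left untouched so they can be reviewed manually.
    return target ? `{{ source('${target[0]}', '${target[1]}') }}` : match;
  });
}

console.log(
  translateSourceRefs('SELECT * FROM ${ref("analytics_123456789", "events_*")}')
);
```

Leaving unmapped references unchanged is deliberate: a missing entry then shows up as leftover Dataform syntax rather than as a wrong source name.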

Config Block Translation

Dataform’s JavaScript config blocks map to dbt’s Jinja config blocks. The property names change slightly.

Dataform:

config {
  type: "table",
  schema: "reporting",
  assertions: {
    uniqueKey: ["customer_id"],
    nonNull: ["customer_id", "email"]
  }
}

dbt splits this into a config block and a YAML test file:

{{ config(
    materialized='table',
    schema='reporting'
) }}

models/marts/_marts__models.yml
models:
  - name: mrt__sales__customers
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - not_null

This separation is intentional. dbt treats tests as first-class objects with their own execution and reporting, rather than inline annotations. See dbt Testing Taxonomy for the full testing model.
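
To make the split concrete, the sketch below regroups a Dataform `assertions` object by column, which is the shape the dbt YAML file expects. `assertionsToColumnTests` is a hypothetical helper, and it covers only `uniqueKey` and `nonNull`; a multi-column `uniqueKey` asserts uniqueness of the combination, which per-column `unique` tests cannot express, so the sketch rejects it rather than guess.

```javascript
// Hypothetical helper: converts a Dataform `assertions` object into the
// per-column test lists used in a dbt YAML file.
function assertionsToColumnTests(assertions) {
  const tests = new Map(); // column name -> list of dbt test names
  const add = (column, test) => {
    if (!tests.has(column)) tests.set(column, []);
    tests.get(column).push(test);
  };

  const uniqueKey = assertions.uniqueKey || [];
  if (uniqueKey.length > 1) {
    // A composite key needs a model-level test, which is out of scope here.
    throw new Error("composite uniqueKey needs a model-level test");
  }
  uniqueKey.forEach((column) => add(column, "unique"));
  (assertions.nonNull || []).forEach((column) => add(column, "not_null"));

  // Map preserves insertion order, so columns appear in first-seen order.
  return [...tests].map(([name, columnTests]) => ({ name, tests: columnTests }));
}

const columns = assertionsToColumnTests({
  uniqueKey: ["customer_id"],
  nonNull: ["customer_id", "email"],
});
console.log(JSON.stringify(columns, null, 2));
```

The result still has to be serialized to YAML and placed under the right model name, which this sketch leaves out.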

Incremental Model Translation

The incremental syntax changes significantly. Dataform uses ${when(incremental(), ...)} for conditional blocks. dbt uses {% if is_incremental() %}.

Dataform:

config {
  type: "incremental",
  uniqueKey: ["event_id"],
  bigquery: {
    partitionBy: "event_date",
    updatePartitionFilter: "event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)"
  }
}

SELECT event_id, event_date, event_name
FROM ${ref("base_events")}
${when(incremental(), `WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)`)}

dbt:

{{ config(
    materialized='incremental',
    unique_key='event_id',
    partition_by={
        'field': 'event_date',
        'data_type': 'date'
    },
    incremental_strategy='merge'
) }}

SELECT event_id, event_date, event_name
FROM {{ ref('base__ga4__events') }}
{% if is_incremental() %}
WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
{% endif %}

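Key differences: dbt needs an explicit partition_by config to create a partitioned table on BigQuery. The incremental_strategy defaults to merge on BigQuery, but stating it explicitly improves readability. Dataform’s updatePartitionFilter has no same-named counterpart in dbt: the filter on incoming rows moves into the WHERE clause inside the is_incremental() block, and where the merge’s scan of the target table also needs pruning, dbt’s incremental_predicates config is likely the closer analogue. See Incremental Models in dbt for strategy details.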
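
The conditional block itself translates almost one-to-one. As a rough sketch, assuming the simple two-argument `when(incremental(), ...)` form with a backtick-quoted body and no else branch, a hypothetical `translateIncremental` helper could rewrite it like this:

```javascript
// Hypothetical helper: rewrites ${when(incremental(), `...`)} into a dbt
// {% if is_incremental() %} block. Only the two-argument form with a
// backtick-quoted body (no nested backticks, no else branch) is handled.
function translateIncremental(sql) {
  const whenIncremental =
    /\$\{\s*when\(\s*incremental\(\)\s*,\s*`([^`]*)`\s*\)\s*\}/g;
  return sql.replace(
    whenIncremental,
    (_match, body) => `{% if is_incremental() %}\n${body.trim()}\n{% endif %}`
  );
}

// Single quotes keep JavaScript from interpolating the ${...} marker.
const incrementalSqlx =
  'SELECT event_id FROM events\n' +
  '${when(incremental(), `WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)`)}';
console.log(translateIncremental(incrementalSqlx));
```

The config block is not covered: unique_key, partition_by, and the strategy still have to be carried over by hand.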

Pre/Post Operations to Hooks

Dataform’s pre_operations and post_operations map to dbt’s pre_hook and post_hook:

Dataform:

config {
  type: "table"
}

pre_operations {
  DELETE FROM ${self()} WHERE date < DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
}

dbt:

{{ config(
    materialized='table',
    pre_hook="DELETE FROM {{ this }} WHERE date < DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)"
) }}

For complex multi-statement operations, dbt supports lists of hooks or custom materializations. See dbt Macros for the macro patterns that replace Dataform’s JavaScript includes.

Directory Structure Mapping

Dataform organizes by source system. dbt conventions organize by transformation layer:

# Dataform          # dbt
definitions/        models/
  sources/            base/
  staging/            intermediate/
  reporting/          marts/
includes/           macros/
dataform.json       dbt_project.yml

The base/ layer replaces sources/ and early staging/. The intermediate/ layer handles joins and enrichment. The marts/ layer replaces reporting/. See dbt Project Structure and Naming for detailed naming conventions.
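
As a starting point for moving files, the folder mapping can be written down as a lookup. This is a sketch under assumptions: `LAYER_MAP` and `toDbtPath` are hypothetical names, the folder names are the ones from the layout above, and the one-to-one mapping is a simplification, since early staging/ models may belong in base/ rather than intermediate/.

```javascript
// Assumed folder mapping, taken from the side-by-side layout above.
const LAYER_MAP = new Map([
  ["sources", "base"],
  ["staging", "intermediate"],
  ["reporting", "marts"],
]);

// Hypothetical helper: proposes a dbt path for a Dataform .sqlx model.
// Returns null for anything outside the mapped layout, such as JavaScript
// declaration files, which become YAML sources rather than models.
function toDbtPath(dataformPath) {
  const [root, layer, ...rest] = dataformPath.split("/");
  const dbtLayer = LAYER_MAP.get(layer);
  if (root !== "definitions" || !dbtLayer || rest.length === 0) return null;
  if (!dataformPath.endsWith(".sqlx")) return null;
  const file = rest.join("/").replace(/\.sqlx$/, ".sql");
  return ["models", dbtLayer, file].join("/");
}

console.log(toDbtPath("definitions/reporting/customers.sqlx"));
```

The proposed paths keep the original file names; renaming to the dbt naming convention is a separate step.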

Features Without Direct Equivalents

Some capabilities exist in one tool but not the other.

dbt has, Dataform lacks:

  • Seeds — CSV files that load as warehouse tables. Useful for mapping tables, test fixtures, and reference data.
  • Snapshots — Built-in SCD Type 2 tracking. See SCD Type 2 with dbt Snapshots. In Dataform, you build this manually.
  • Package ecosystem — 200+ community packages via the dbt Hub. Dataform can install npm packages through package.json, but has no comparable curated hub.
  • Source freshness checks — Native monitoring of source table staleness.

Dataform has, dbt lacks:

  • Native JavaScript — Full programmatic model generation. See JavaScript vs Jinja in Analytics Engineering for the implications.
  • Free tier on BigQuery — No licensing cost for BigQuery-only usage.
  • Built-in scheduling — Integrated with Google Cloud without external orchestration.

These gaps drive the migration decision more than syntax differences do.