A reference translation between Dataform and dbt concepts. Most concepts have direct equivalents. The syntax changes; the intent does not.
Direct Syntax Equivalents
| Dataform | dbt | Notes |
|---|---|---|
| ${ref("model")} | {{ ref('model') }} | Same DAG-building reference |
| config { type: "table" } | {{ config(materialized='table') }} | Materialization declaration |
| config { type: "view" } | {{ config(materialized='view') }} | Default in both tools |
| config { type: "incremental" } | {{ config(materialized='incremental') }} | See Incremental Models in dbt |
| ${self()} | {{ this }} | Reference to the current model’s table |
| ${when(incremental(), ...)} | {% if is_incremental() %} ... {% endif %} | Conditional incremental logic |
| .sqlx files | .sql files | File extension |
| definitions/ | models/ | Model directory |
| includes/ | macros/ | Reusable code directory |
| dataform.json | dbt_project.yml | Project configuration |
Source Declarations
This is where the tools start to diverge. Dataform uses JavaScript declaration files. dbt uses YAML source definitions that bundle documentation and freshness checks into a single location.
Dataform:

    declare({
      database: "my-project",
      schema: "analytics_123456789",
      name: "events_*"
    });

dbt:

    sources:
      - name: ga4
        database: my-project
        schema: analytics_123456789
        tables:
          - name: events
            identifier: "events_*"
            freshness:
              warn_after: {count: 24, period: hour}
            description: "Raw GA4 event export"

The dbt version adds freshness monitoring and documentation in the same file. In Dataform, freshness checks require separate implementation. The {{ source('ga4', 'events') }} function in dbt replaces ${ref("analytics_123456789", "events_*")} in Dataform.
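To show how the source definition is consumed, here is a minimal sketch of a dbt model that selects from it. The file path, the model name, the chosen columns, and the 3-day window are assumptions made for illustration; only the source name `ga4` and table name `events` come from the YAML above.

```sql
-- models/base/base__ga4__events.sql (hypothetical path and model name)
{{ config(materialized='view') }}

SELECT
    event_date,
    event_name,
    user_pseudo_id
FROM {{ source('ga4', 'events') }}
-- _TABLE_SUFFIX is BigQuery's pseudo-column for wildcard tables.
-- The 3-day window is arbitrary and only keeps the scan small.
WHERE _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY))
```

Because the source sets `identifier: "events_*"`, the rendered table reference is the wildcard table, so the model name in `source()` can stay short while the physical name keeps its suffix pattern.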
Config Block Translation
Dataform’s JavaScript config blocks map to dbt’s Jinja config blocks. The property names change slightly.
Dataform:

    config {
      type: "table",
      schema: "reporting",
      assertions: {
        uniqueKey: ["customer_id"],
        nonNull: ["customer_id", "email"]
      }
    }

dbt splits this into a config block and a YAML test file.

Model file:

    {{ config(
        materialized='table',
        schema='reporting'
    ) }}

YAML file:

    models:
      - name: mrt__sales__customers
        columns:
          - name: customer_id
            tests:
              - unique
              - not_null
          - name: email
            tests:
              - not_null

This separation is intentional. dbt treats tests as first-class objects with their own execution and reporting, rather than inline annotations. See dbt Testing Taxonomy for the full testing model.
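Because tests are separate objects, a check that does not fit the built-in generic tests can live in its own SQL file. The sketch below is a dbt singular test, which passes when the query returns zero rows. The file name and the rule itself (an email must contain an `@`) are hypothetical and do not appear in the Dataform example.

```sql
-- tests/assert_customer_email_has_at_sign.sql (hypothetical file name)
-- Every row returned by this query is reported as a test failure.
SELECT
    customer_id,
    email
FROM {{ ref('mrt__sales__customers') }}
WHERE email NOT LIKE '%@%'
```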
Incremental Model Translation
The incremental syntax changes significantly. Dataform uses ${when(incremental(), ...)} for conditional blocks. dbt uses {% if is_incremental() %}.
Dataform:

    config {
      type: "incremental",
      uniqueKey: ["event_id"],
      bigquery: {
        updatePartitionFilter: "event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)"
      }
    }

    SELECT
      event_id,
      event_date,
      event_name
    FROM ${ref("base_events")}
    ${when(incremental(), `WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)`)}

dbt:

    {{ config(
        materialized='incremental',
        unique_key='event_id',
        partition_by={
          'field': 'event_date',
          'data_type': 'date'
        },
        incremental_strategy='merge'
    ) }}

    SELECT
      event_id,
      event_date,
      event_name
    FROM {{ ref('base__ga4__events') }}
    {% if is_incremental() %}
    WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
    {% endif %}

Key differences: dbt requires explicit partition_by for BigQuery partitioned tables. The incremental_strategy defaults to merge on BigQuery, but making it explicit improves readability. Dataform’s updatePartitionFilter becomes part of your WHERE clause logic directly. See Incremental Models in dbt for strategy details.
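A common variation, sketched here as an assumption rather than as part of the original migration, anchors the lookback window to the newest row already in the target table instead of to the current date. It combines `is_incremental()` with `{{ this }}` from the equivalents table, and the 3-day lookback is carried over from the example above.

```sql
{{ config(
    materialized='incremental',
    unique_key='event_id',
    partition_by={'field': 'event_date', 'data_type': 'date'},
    incremental_strategy='merge'
) }}

SELECT
    event_id,
    event_date,
    event_name
FROM {{ ref('base__ga4__events') }}
{% if is_incremental() %}
{# "this" resolves to the already-built target table of this model #}
WHERE event_date >= (
    SELECT DATE_SUB(MAX(event_date), INTERVAL 3 DAY)
    FROM {{ this }}
)
{% endif %}
```

If a scheduled run is skipped for several days, this form still picks up from the last loaded date, whereas a filter based on `CURRENT_DATE()` would leave a gap. Note that if the target table exists but is empty, `MAX(event_date)` is NULL and the filter matches no rows.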
Pre/Post Operations to Hooks
Dataform’s pre_operations and post_operations map to dbt’s pre_hook and post_hook:
Dataform:

    config { type: "table" }

    pre_operations {
      DELETE FROM ${self()} WHERE date < DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
    }

dbt:

    {{ config(
        materialized='table',
        pre_hook="DELETE FROM {{ this }} WHERE date < DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)"
    ) }}

For complex multi-statement operations, dbt supports lists of hooks or custom materializations. See dbt Macros for the macro patterns that replace Dataform’s JavaScript includes.
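As a sketch of the list form, the config below passes two statements to `post_hook`, and dbt runs them in order after the model is built. The incremental materialization, the upstream model name `int__sales__daily`, the selected columns, and the table description are hypothetical choices for illustration.

```sql
{{ config(
    materialized='incremental',
    post_hook=[
        "DELETE FROM {{ this }} WHERE date < DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)",
        "ALTER TABLE {{ this }} SET OPTIONS (description = 'Rolling 90-day window')"
    ]
) }}

SELECT
    date,
    customer_id,
    revenue
FROM {{ ref('int__sales__daily') }}  -- hypothetical upstream model
```

Post-hooks are used here instead of pre-hooks because the target table is guaranteed to exist once the model has run, which avoids a failing `DELETE` on the very first build.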
Directory Structure Mapping
Dataform organizes by source system. dbt conventions organize by transformation layer:

    # Dataform          # dbt
    definitions/        models/
      sources/            base/
      staging/            intermediate/
      reporting/          marts/
    includes/           macros/
    dataform.json       dbt_project.yml

The base/ layer replaces sources/ and early staging/. The intermediate/ layer handles joins and enrichment. The marts/ layer replaces reporting/. See dbt Project Structure and Naming for detailed naming conventions.
Features Without Direct Equivalents
Some capabilities exist in one tool but not the other.
dbt has, Dataform lacks:
- Seeds — CSV files that load as warehouse tables. Useful for mapping tables, test fixtures, and reference data.
- Snapshots — Built-in SCD Type 2 tracking. See SCD Type 2 with dbt Snapshots. In Dataform, you build this manually.
- Package ecosystem — 200+ community packages via the dbt Hub. Dataform has no comparable package hub.
- Source freshness checks — Native monitoring of source table staleness.
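To make the snapshot item in the list above concrete, here is a minimal sketch of a SQL-defined snapshot using the timestamp strategy. The snapshot name, the `crm.customers` source, and the `updated_at` column are hypothetical. Recent dbt versions also allow snapshots to be configured in YAML, so treat this as one possible form.

```sql
{% snapshot customers_snapshot %}

{{ config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
) }}

-- Hypothetical source; any query with a stable key and an update timestamp works.
SELECT * FROM {{ source('crm', 'customers') }}

{% endsnapshot %}
```

Each `dbt snapshot` run compares the query result with the stored rows and maintains `dbt_valid_from` and `dbt_valid_to` columns, which is the SCD Type 2 history that would otherwise be hand-written in Dataform.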
Dataform has, dbt lacks:
- Native JavaScript — Full programmatic model generation. See JavaScript vs Jinja in Analytics Engineering for the implications.
- Free tier on BigQuery — No licensing cost for BigQuery-only usage.
- Built-in scheduling — Integrated with Google Cloud without external orchestration.
These gaps drive the migration decision more than syntax differences do.