
Dataform-to-dbt Migration

Migration paths between Dataform and dbt — tooling, realistic timelines by project size, and why macro conversion is where migrations get painful

Tags: dataform, dbt, bigquery, data engineering, data modeling

Migration between Dataform and dbt is possible in both directions, but the effort scales non-linearly with project complexity. Simple model conversion is largely automatable. Macro and JavaScript logic conversion is where projects stall.

Available Migration Tools

dbt to Dataform: ra_dbt_to_dataform converts dbt projects to Dataform format. It uses GPT-4 for complex macro translation — an acknowledgment that Jinja-to-JavaScript conversion is too irregular for deterministic tooling. The LLM handles the edge cases where Jinja patterns have no direct JavaScript equivalent.

Dataform to dbt: dataform-to-dbt provides the reverse path. SQLX config blocks map to dbt YAML configurations. JavaScript ${ref()} calls become Jinja {{ ref() }}. The structural conversion is straightforward for standard models.
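As a sketch of that structural mapping (model and column names illustrative, not taken from either tool's documentation), a minimal SQLX model and its converted dbt counterpart:

```
-- definitions/orders.sqlx (Dataform)
config {
  type: "table",
  schema: "analytics"
}
SELECT order_id, amount_cents FROM ${ref("raw_orders")}

-- models/orders.sql (dbt, after conversion)
{{ config(materialized='table', schema='analytics') }}
SELECT order_id, amount_cents FROM {{ ref('raw_orders') }}
```

One structural difference worth noting: Dataform keeps descriptions and column documentation inside the SQLX config block, while dbt expects them in a separate schema YAML file, so conversion tools must split a single file into two.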

Both tools handle the easy 80% — model definitions, basic configurations, dependency declarations. Neither fully automates the hard 20% — custom logic, complex templating, and project-specific patterns.

Realistic Timelines

Migration duration varies dramatically by what is in the project, not just how many models it contains:

Project Size               | Expected Timeline | Primary Effort
Small (~20 models)         | 1-2 weeks         | Mostly automated conversion, manual review
Medium (~50-100 models)    | 2-4 weeks         | Macro and JavaScript conversion
Large (100+ models)        | 2-3 months        | Full rewrite of programmatic logic
Enterprise with validation | 3-6 months        | Parallel running, stakeholder sign-off

The jump from “medium” to “large” is not proportional to model count. It reflects the complexity of custom logic that accumulates in mature projects. A 200-model project with simple SELECT statements migrates faster than a 50-model project with extensive JavaScript code generation or deeply nested Jinja macros.

Where Migrations Get Painful

Macro and Templating Conversion

This is the dominant cost in any non-trivial migration. Every custom Jinja {% macro %} block needs manual rewriting as JavaScript (or vice versa). The two templating systems are conceptually similar — both generate SQL at compile time — but syntactically and structurally different in ways that resist automated translation.

Jinja macros operate on string concatenation within SQL files. They feel like SQL with logic injected:

{% macro cents_to_dollars(column_name, scale=2) %}
ROUND({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}

JavaScript in Dataform operates on programmatic construction. The same logic looks fundamentally different:

// includes/utils.js
function centsToDollars(columnName, scale = 2) {
  return `ROUND(${columnName} / 100.0, ${scale})`;
}
module.exports = { centsToDollars };
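Because the Dataform version is ordinary JavaScript returning a SQL string, the generated fragment can be inspected directly in Node, which is a practical way to spot-check converted macros:

```javascript
// Same helper as the sketch above (includes/utils.js) — plain JavaScript,
// not a Dataform-specific API.
function centsToDollars(columnName, scale = 2) {
  return `ROUND(${columnName} / 100.0, ${scale})`;
}

// In a SQLX file this would be invoked as ${utils.centsToDollars("amount_cents")};
// here we just print the SQL fragment it emits.
console.log(centsToDollars("amount_cents"));   // ROUND(amount_cents / 100.0, 2)
console.log(centsToDollars("price_cents", 4)); // ROUND(price_cents / 100.0, 4)
```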

Simple utility macros translate cleanly. The pain starts with:

  • Macros that query the database — dbt’s run_query() has different execution semantics than Dataform’s compilation model
  • Macros using dbt context objects — this, target, model, and graph have no direct Dataform equivalents
  • Dispatch patterns — cross-database macros using adapter.dispatch() are meaningless in Dataform’s single-platform world, but the logic they encode still needs to exist
  • Package macros — any macro from dbt-utils, dbt-expectations, or other packages must be reimplemented or replaced with custom assertions

For projects that rely heavily on shared macros (a common pattern in mature dbt projects with 20-30 custom macros), budget 40-60% of total migration time for macro conversion alone.

Parallel Running

Enterprise migrations require parallel running: both tools executing the same transformations against production data, with results compared for equivalence. This validation period typically runs 2-8 weeks depending on data refresh cadence and stakeholder risk tolerance.

Parallel running catches:

  • Logic differences introduced during conversion (rounding behavior, null handling, join semantics)
  • Ordering differences that surface in downstream reports (row order changes breaking BI tools with hardcoded assumptions)
  • Timing differences when incremental models process different data windows during the transition

The dbt-audit-helper package simplifies comparison queries during parallel running. Dataform has no equivalent utility, so comparison logic must be written manually.
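That manual comparison logic can live in a small include. A sketch, assuming BigQuery and illustrative table and function names: a symmetric difference between the legacy and migrated tables, where any returned row is a conversion discrepancy.

```javascript
// Sketch of a hand-rolled audit query generator (compareRelations is an
// assumed helper name, not a Dataform API). Rows appearing on exactly one
// side of the EXCEPT indicate the two pipelines disagree.
function compareRelations(oldTable, newTable, columns) {
  const cols = columns.join(", ");
  return `
(
  SELECT 'only_in_old' AS side, ${cols} FROM ${oldTable}
  EXCEPT DISTINCT
  SELECT 'only_in_old', ${cols} FROM ${newTable}
)
UNION ALL
(
  SELECT 'only_in_new' AS side, ${cols} FROM ${newTable}
  EXCEPT DISTINCT
  SELECT 'only_in_new', ${cols} FROM ${oldTable}
)`.trim();
}
module.exports = { compareRelations };
```

An empty result set across all compared models is the exit criterion for the parallel-running period.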

Incremental Model State

Incremental models deserve special attention during migration. Both tools support incremental materialization, but the state tracking mechanisms differ. A model that was incrementally built in one tool cannot resume incrementally in the other — the first run in the new tool must be a full refresh.

For large tables, this means a one-time cost spike during migration. A model that normally processes 20GB incrementally will process 2TB on its first run in the new tool. Plan BigQuery capacity and budget accordingly.
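The differing mechanics are visible in the incremental condition itself (model and column names assumed for illustration): dbt gates the filter on is_incremental() and references its own table via this, while Dataform uses when(incremental(), …) and self().

```
-- models/events.sql (dbt)
{{ config(materialized='incremental') }}
SELECT * FROM {{ ref('raw_events') }}
{% if is_incremental() %}
  WHERE event_ts > (SELECT MAX(event_ts) FROM {{ this }})
{% endif %}

-- definitions/events.sqlx (Dataform)
config { type: "incremental" }
SELECT * FROM ${ref("raw_events")}
${when(incremental(), `WHERE event_ts > (SELECT MAX(event_ts) FROM ${self()})`)}
```

Neither tool can read the other's notion of "what has already been processed", which is why the first run after migration must be a full refresh.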

The Two-Year Rule

A practical heuristic for migration decisions: if the total migration cost (engineering time, parallel running, productivity loss during transition) exceeds two years of licensing savings, the migration does not make financial sense.

For a team of 10 engineers on dbt Cloud ($12,000/year licensing), the two-year threshold is $24,000 in total migration cost. If the project has 50+ custom macros, complex CI/CD pipelines, and enterprise validation requirements, the migration cost will likely exceed that threshold. The licensing savings are real but insufficient to justify the disruption.
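The heuristic itself is a one-line comparison; a sketch using the figures above (all dollar amounts illustrative):

```javascript
// Two-year rule: migration is financially justified only if its total cost
// (engineering time, parallel running, lost productivity) stays within two
// years of licensing savings.
function migrationJustified(migrationCostUsd, annualLicensingSavingsUsd) {
  return migrationCostUsd <= 2 * annualLicensingSavingsUsd;
}

console.log(migrationJustified(24000, 12000)); // true — exactly at the threshold
console.log(migrationJustified(80000, 12000)); // false — cost exceeds two years of savings
```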

The calculation flips for the reverse direction. Moving from Dataform to dbt has no licensing savings to fund the migration — it costs money both in migration effort and in new licensing. Teams typically make this move for ecosystem access (packages, CI/CD, IDE tooling) or platform portability rather than cost optimization.