Most teams adopt dbt model contracts by retrofitting them onto existing models: the model exists, and someone writes YAML to describe it. An alternative approach for new models intended to serve downstream consumers is to define the contract before writing any SQL.
## The API Design Analogy
Software engineers don’t build an API endpoint and then figure out what the response schema should be. They define the interface — what fields the response will have, what types they’ll be, what constraints apply — agree on it with the clients who’ll consume it, and then implement it. The interface definition comes first; the implementation follows.
Data models can work the same way. The question isn’t “what does my SQL produce?” — it’s “what do consumers need?” Defining the contract first means starting with that question and making the SQL satisfy the answer rather than the other way around.
## The Contract-First Workflow
**Step 1: Agree on the interface.** This is a conversation between the team building the model and the teams consuming it. What columns does the consumer need? What types? What constraints? This happens before any code is written. The outcome is agreement, not YAML.
**Step 2: Write the YAML contract.** Translate the agreement into a dbt model definition with `contract: {enforced: true}`, column declarations with types, and any constraints:
```yaml
models:
  - name: mrt__analytics__customers
    access: public
    config:
      materialized: table
      contract:
        enforced: true
    columns:
      - name: customer__id
        data_type: integer
        constraints:
          - type: not_null
          - type: primary_key
      - name: customer__name
        data_type: text
      - name: customer__lifetime_value
        data_type: numeric(38,2)
      - name: customer__segment
        data_type: text
      - name: customer__is_active
        data_type: boolean
```

**Step 3: Implement the model.** Write SQL that satisfies the contract. If your query returns a column with the wrong type, or returns columns the YAML doesn’t declare, `dbt compile` tells you immediately:
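As an illustration, a model body satisfying this contract might look like the following sketch. The staging model name and source column names are assumptions, not part of the contract; the point is that every cast is explicit so the declared types are met deliberately:

```sql
-- models/marts/mrt__analytics__customers.sql
-- Hypothetical upstream: a staging model named stg_customers.
select
    customer_id::integer           as customer__id,
    customer_name::text            as customer__name,
    lifetime_value::numeric(38, 2) as customer__lifetime_value,  -- explicit cast so the contract's type check passes
    segment::text                  as customer__segment,
    is_active::boolean             as customer__is_active
from {{ ref('stg_customers') }}
```

Any column the query returns that the YAML doesn’t declare, or any type drift in the casts above, surfaces at compile time rather than after a consumer notices.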
```
Compilation Error in model mrt__analytics__customers
  This model has an enforced contract that failed.

  | column_name              | definition_type | contract_type | mismatch_reason    |
  | ------------------------ | --------------- | ------------- | ------------------ |
  | customer__lifetime_value | TEXT            | NUMERIC(38,2) | data type mismatch |
```

The compile step is the feedback loop. You’re not running the full build and checking test output; you’re compiling, seeing exactly what doesn’t match, and fixing it. CI validates the rest.
**Step 4: Mark the contract as the canonical interface.** Once the model builds cleanly, set `access: public` and communicate the version to consumers. They can start referencing `ref('your_project', 'mrt__analytics__customers')` knowing the contract guarantees the shape.
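On the consumer side, a downstream project can then build against the interface rather than the implementation. A sketch (the project and model names follow the example above; in dbt Mesh the upstream project must also be declared in the consumer’s `dependencies.yml`):

```sql
-- A model in the consuming project, pinned to the public contract.
select
    customer__id,
    customer__lifetime_value
from {{ ref('your_project', 'mrt__analytics__customers') }}
where customer__is_active
```

Because the contract guarantees column names and types, this query can be written and reviewed before the upstream model is even implemented.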
## Why This Order Matters
The retrofit approach — write the model, add the contract to describe it — treats the contract as documentation of what already exists. The contract reflects implementation decisions made without consumer input.
Contract-first inverts the dependency. Consumers express their needs. The contract captures those needs. The SQL exists to satisfy the contract. This is a meaningful difference in practice:
- Columns that consumers don’t need don’t end up in the contract, which keeps marts focused
- Type decisions are made consciously rather than defaulting to whatever SQL returns
- Breaking changes require explicit renegotiation, because the contract was the original agreement
- Multiple downstream consumers can review the proposed YAML and flag problems before implementation begins
The friction cost is upfront coordination, which feels slower. The payoff is that the model’s interface is deliberately designed rather than accidentally documented.
## ODCS + Data Contract CLI Integration
The Open Data Contract Standard (ODCS) takes this further. You can define a contract using the ODCS YAML specification — which captures SLAs, ownership metadata, and quality rules beyond what dbt handles — and then generate dbt model YAML from it using the Data Contract CLI:
```shell
datacontract export --format dbt
```

The generated dbt YAML is a subset of the broader contract, focused on the structural guarantees dbt can enforce at compile time. The broader ODCS contract lives as a standalone document covering the full data product agreement: who owns it, what SLAs apply, what quality rules are embedded.
This bridges two levels of contract:
- Organizational level: the ODCS contract, agreed to by producers and consumers, covering the full data product lifecycle
- Technical level: the dbt model contract, enforced at compile time, covering structural guarantees
The workflow: define the ODCS contract (with or without tooling; a YAML file is enough), run `datacontract export --format dbt` to generate the initial column declarations, then add the implementation SQL. Changes to the organizational contract can be re-exported to keep the dbt YAML in sync.
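To show the shape of such a contract, here is a minimal ODCS-style sketch. The field names follow the ODCS v3 layout as I understand it and the values are purely illustrative; verify the exact schema against the published standard before adopting it:

```yaml
# Illustrative ODCS-style contract; not a complete or validated document.
apiVersion: v3.0.0
kind: DataContract
id: analytics-customers
version: 1.0.0
status: active
schema:
  - name: customers
    physicalName: mrt__analytics__customers
    properties:
      - name: customer__id
        logicalType: integer
        required: true
        primaryKey: true
      - name: customer__lifetime_value
        logicalType: number
slaProperties:          # organizational guarantees dbt cannot enforce
  - property: frequency
    value: daily
```

The `schema` section is what an export to dbt YAML would draw from; the SLA and ownership sections stay in the standalone document.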
For teams that don’t need ODCS’s organizational metadata yet, the contract-first approach doesn’t require it. You just write the YAML yourself, agree on it with consumers before writing SQL, and use `dbt compile` as the implementation feedback loop.
## Practical Starting Point
The contract-first approach is most valuable for:
- New models that will be `access: public`
- Models in a dbt Mesh setup where cross-project consumers exist from day one
- High-stakes tables (executive dashboards, partner data feeds) where interface stability matters immediately
For internal models that nobody else depends on, the overhead of formal contract-first design isn’t justified. The governance investment scales with how many consumers depend on the interface.
The rollout strategy for existing models describes how to retrofit contracts on already-built models. Contract-first is the forward-looking version of that work — the approach that makes retrofitting unnecessary.