Note

Data Contract Anti-Patterns

Where data contract initiatives go wrong: misplaced enforcement, paper-only contracts, one-size-fits-all implementations, and unfunded ownership.

Tags: dbt, data quality, data engineering

This note covers structural anti-patterns that cause data contract initiatives to fail, distinct from the organizational adoption challenge. The most common failure mode: dbt contracts exist on mart models owned by the data team, but no other team has adopted them, and models owned by others still break without warning.

Contracts in the wrong place

Chad Sanderson and Mark Freeman argue in Data Contracts: Developing Production-Grade Pipelines at Scale (O’Reilly, 2025) that “most data teams are placing their contracts in exactly the wrong spot.” Data teams implement contracts on the models they control, because that’s where they have authority. But contracts on mart models only catch problems after bad data has already entered the warehouse. They’re a safety net, not a prevention mechanism.

Sifflet’s Salma Bakouk puts the outcome bluntly: “After four years of widespread adoption, we’re left with YAML files that go stale, schema definitions that drift from business logic, and teams that treat contracts like documentation rather than enforceable agreements.”

The ideal enforcement point is as close to the data source as possible. For event streams, that means a schema registry. For batch pipelines, that means schema contract modes in your EL tool. For everything else, a post-load validation layer like Soda fills the gap before dbt even runs. dbt contracts at the mart layer are still valuable, but they’re the last line of defense, not the first.
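To make the post-load validation idea concrete, here is a minimal sketch in plain Python. It is not the API of Soda, a schema registry, or any EL tool. The `CONTRACT` dict, the column names, and `validate_batch` are all hypothetical, invented for illustration. The point is only that the check runs on freshly loaded rows before any downstream model sees them.

```python
from typing import Any

# Hypothetical contract for an "orders" batch:
# column name -> (expected Python type, whether nulls are allowed).
CONTRACT: dict[str, tuple[type, bool]] = {
    "order_id": (int, False),
    "customer_id": (int, False),
    "amount": (float, False),
    "coupon_code": (str, True),
}


def validate_batch(rows: list[dict[str, Any]]) -> list[str]:
    """Return human-readable violations; an empty list means the batch passes."""
    violations: list[str] = []
    for i, row in enumerate(rows):
        # Columns the contract promises but the row does not carry.
        for column in sorted(CONTRACT.keys() - row.keys()):
            violations.append(f"row {i}: missing column '{column}'")
        for column, (expected_type, nullable) in CONTRACT.items():
            if column not in row:
                continue  # already reported as missing
            value = row[column]
            if value is None:
                if not nullable:
                    violations.append(f"row {i}: '{column}' is null")
            elif not isinstance(value, expected_type):
                violations.append(
                    f"row {i}: '{column}' expected {expected_type.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations


good = [{"order_id": 1, "customer_id": 7, "amount": 19.99, "coupon_code": None}]
bad = [{"order_id": "2", "customer_id": 7, "amount": None}]

print(validate_batch(good))  # no violations
print(validate_batch(bad))   # three violations for row 0
```

In a real pipeline, a non-empty result would stop the run or quarantine the batch, so the mart-layer dbt contract never has to be the first thing that notices.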

Sanderson’s four failure layers

Sanderson breaks contract failure into four layers. Each one compounds the others.

Contracts are socio-technical, so they fail when treated as a purely technical concern. You can have the perfect YAML specification and the most thorough CI validation, but if the team that produces the data doesn’t understand why they should care, the contract is a document that one side wrote and the other ignores. The ownership model determines whether contracts create accountability or just paperwork.

Adoption friction kills contracts quietly. Without obsessive attention to onboarding ease, support fades. If adopting contracts means engineers need to learn a new tool, write unfamiliar YAML, and add a deployment step, it won’t happen no matter how compelling the pitch. The teams that succeed make contract adoption feel like a natural extension of existing workflows, not a new process bolted on top.

Tech sprawl compounds the problem. Each new tool or database adds integration complexity. A contract that works for Kafka events doesn’t map cleanly to batch BigQuery extracts. A validation framework that covers Snowflake may not support Databricks. The more heterogeneous the stack, the harder it is to maintain a single coherent contract practice.

Versioning gets overlooked. Contracts need to align with the right historical deployments, not just the current state. If your contract says “this event has five fields” but the data you loaded yesterday was produced by a version of the service that only had four fields, you’ll generate false violations against historical data. Version management across contracts, schemas, and deployment history is operationally complex and rarely addressed upfront.
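The versioning problem is easier to see in code. The sketch below assumes you keep a history of contract versions keyed by the date each one took effect, and that every record carries a production timestamp. Both are assumptions, and `CONTRACT_VERSIONS`, `contract_for`, and `missing_fields` are hypothetical names rather than anything from a real tool.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Hypothetical deployment history, sorted by effective date:
# (effective-from timestamp, fields required from that point on).
CONTRACT_VERSIONS = [
    (datetime(2024, 1, 1, tzinfo=timezone.utc),
     {"event_id", "user_id", "amount", "currency"}),
    (datetime(2024, 6, 1, tzinfo=timezone.utc),
     {"event_id", "user_id", "amount", "currency", "channel"}),
]


def contract_for(produced_at: datetime) -> set[str]:
    """Pick the contract version that was live when the record was produced."""
    effective_dates = [ts for ts, _ in CONTRACT_VERSIONS]
    index = bisect_right(effective_dates, produced_at) - 1
    if index < 0:
        raise ValueError(f"no contract version covers {produced_at.isoformat()}")
    return CONTRACT_VERSIONS[index][1]


def missing_fields(record: dict, produced_at: datetime) -> set[str]:
    """Fields required by the matching contract version but absent from the record."""
    return contract_for(produced_at) - set(record)


four_field_event = {"event_id": 1, "user_id": 2, "amount": 3.0, "currency": "EUR"}

# Produced under the four-field version: no violation.
print(missing_fields(four_field_event, datetime(2024, 3, 15, tzinfo=timezone.utc)))
# The same shape produced after the five-field version shipped: a real violation.
print(missing_fields(four_field_event, datetime(2024, 7, 1, tzinfo=timezone.utc)))
```

Validating the March record against the current five-field contract would flag it, which is exactly the false violation described above.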

The recurring anti-patterns

These specific failure modes show up across organizations:

Paper-only contracts live in wikis or Confluence pages but nothing enforces them. Someone writes a description of the expected schema, maybe with a table of column names and types. It looks like governance. It provides no actual protection. The first time a schema change happens, nobody checks the wiki. Six months later the documentation describes a dataset that no longer exists in that form.

One-size-fits-all implementations force a single contract format across fundamentally different data patterns. Event-driven domains behave differently from batch-heavy ones. Streaming data from Kafka needs schema registry enforcement. SaaS exports from Salesforce can’t have producer-side contracts at all because you don’t control the source; the most you can do is validate what arrives. Trying to apply the same contract template to all three creates friction for every team without fitting any of them well.

Unfunded ownership gives teams contract responsibility without the budget to maintain what they own. A software engineering team gets told “you now own the contract for your payments events” but nobody adjusts their sprint capacity or on-call rotation. The contract is an unfunded mandate, and unfunded mandates get the attention they deserve: none.

Boil-the-ocean scope tries to put contracts on every dataset at once. This is the “organization-wide rollout” that almost never succeeds on the first attempt. The better approach is starting with two high-impact datasets, demonstrating value, and expanding from demonstrated success rather than top-down mandate.

The stale contract problem

The worst outcome is contracts that exist but don’t reflect reality. A team writes contracts during an initial push, maintains them for a few months, then stops updating them as the underlying data evolves. New columns appear that aren’t in the contract. Type changes happen without contract updates. The contract says the data has certain quality guarantees that nobody is actually checking.

Stale contracts are worse than no contracts. At least without contracts, people know they need to be careful. Stale contracts create false confidence. Someone trusts the documented schema, builds a model against it, and discovers too late that the contract hasn’t matched the actual data for six months.

The prevention is enforcement. When a contract is validated in CI/CD, it can’t go stale because any drift between the contract and the actual data breaks the pipeline. The investment in enforcement infrastructure is what separates contracts that last from contracts that decay. This is the same gap between contract-as-documentation and contract-as-enforcement that determines whether the whole initiative succeeds.
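As a rough illustration of what "drift breaks the pipeline" can look like, here is a sketch of a CI check that compares a declared contract with the schema actually observed in the warehouse. The `CONTRACT` dict, the `observed` schema, and the function names are hypothetical. In practice the observed columns would come from the warehouse's information schema, and dbt's own contract enforcement would cover the models dbt builds.

```python
import sys

# Hypothetical contract: column name -> declared type.
CONTRACT = {"order_id": "integer", "amount": "numeric", "created_at": "timestamp"}


def find_drift(contract: dict[str, str], actual: dict[str, str]) -> list[str]:
    """Compare declared columns with observed ones and describe every mismatch."""
    problems = []
    for column in sorted(contract.keys() - actual.keys()):
        problems.append(f"missing column: {column}")
    for column in sorted(actual.keys() - contract.keys()):
        problems.append(f"undeclared column: {column}")
    for column in sorted(contract.keys() & actual.keys()):
        if contract[column] != actual[column]:
            problems.append(
                f"type changed: {column} declared {contract[column]}, "
                f"found {actual[column]}"
            )
    return problems


def main(actual: dict[str, str]) -> int:
    """Return a process exit code; anything non-zero fails the CI job."""
    problems = find_drift(CONTRACT, actual)
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0


# Stand-in for the schema read from the warehouse.
observed = {"order_id": "integer", "amount": "text", "discount": "numeric"}
exit_code = main(observed)
```

A CI step would end with `sys.exit(main(observed))`, so any of the three kinds of drift blocks the merge until someone updates either the data or the contract.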