Data contracts are formal, versioned agreements between data producers and consumers. This hub connects notes that decompose the concept into its constituent parts: what contracts are, the specification behind them, who owns them, how they’re enforced, where they fit in a broader quality strategy, and why adoption remains harder than the technology.
Core Concepts
-
Data Contract Definition — What a data contract is, the “non-consensual API” problem, how contracts differ from schema tests and quality checks, and a brief history from GoCardless to the Linux Foundation.
-
Data Quality Validation Layers — The three-layer model for data quality: proactive contracts, reactive tests, and anomaly detection. Why you need all three and the maturity path for adopting them.
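The three-layer model in the last note can be pictured as three separate checks that run at different moments. The following is a minimal sketch only, assuming a hypothetical `orders` dataset; the `CONTRACT` mapping, the function names, and the z-score threshold are illustrative assumptions, not taken from the linked notes or from any particular tool.

```python
from statistics import mean, stdev

# Hypothetical contract: expected column names and Python types.
CONTRACT = {"order_id": int, "amount": float}


def contract_violations(row, contract=CONTRACT):
    """Layer 1 (proactive): check a row against the agreed schema before accepting it."""
    problems = []
    for column, expected_type in contract.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(f"wrong type for {column}")
    return problems


def failed_tests(rows):
    """Layer 2 (reactive): assert business rules on data that has already landed."""
    failures = []
    if any(r["amount"] < 0 for r in rows):
        failures.append("amount must be non-negative")
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("order_id must be unique")
    return failures


def is_anomalous(history, today, threshold=3.0):
    """Layer 3 (anomaly detection): flag a daily row count far from its history.

    Needs at least two historical values, because stdev() requires them.
    """
    sigma = stdev(history)
    if sigma == 0:
        return today != history[0]
    return abs(today - mean(history)) / sigma > threshold
```

Each layer catches a different failure: the contract check rejects malformed rows up front, the tests catch rule violations in well-formed data, and the anomaly check notices shifts that no explicit rule describes.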
Specification and Tooling
-
Open Data Contract Standard — ODCS v3.1.0 under the Linux Foundation’s Bitol project. What the spec covers (schema, quality rules, SLAs, ownership), how it compares to the Data Contract Specification, and where it complements dbt model contracts.
-
Data Contract Tooling Ecosystem — The landscape in 2026: dedicated contract tools (Data Contract CLI, Gable.ai), quality frameworks with contract support (Soda, dbt-expectations, dbt native contracts), and governance platforms (Atlan, Collibra, DataHub).
Organization and Adoption
-
Data Contract Ownership Models — Producer-defined vs. consumer-defined vs. collaborative contracts. The Convoy insight about visibility. Architecture patterns from GoCardless, Convoy, and PayPal.
-
Data Contract Adoption Challenges — The execution gap between contract-as-documentation and contract-as-enforcement. Why the cultural change matters more than the YAML. A practical adoption path for most organizations.
-
Data Contract Anti-Patterns — Where contract initiatives go wrong structurally: misplaced enforcement, paper-only contracts, one-size-fits-all implementations, unfunded ownership, and the stale contract problem.
-
Data Contract Adoption Friction — Reducing the friction that kills adoption: SDK-based onboarding, audience-specific messaging, post-mortem data as leverage, embedding contracts in engineering KPIs, and the Data Product Manager role.
-
Data Contract Rollout Change Management — The organizational change management strategy: start with two datasets, create urgency through visible cost, measure conversations rather than coverage, and the three phases of success measurement.
dbt Contract Implementation
-
dbt Model Contract Mechanics — How dbt’s native model contracts work: the preflight check, DDL generation, fail-fast behavior, configuration options, and what contracts do and don’t validate.
-
dbt Constraint Enforcement Across Warehouses — How dbt constraint types behave across Postgres, Snowflake, BigQuery, Redshift, and Databricks. Which constraints reject bad data and which are metadata only.
-
dbt Model Versioning — Schema evolution with contracts: breaking vs non-breaking changes, the state:modified selector, version integers, deprecation dates, and the friction points.
-
dbt Contract Rollout Strategy — Adopting contracts in an existing project: identifying candidates, scaffolding YAML with dbt-codegen, phased enablement, and CI/CD integration with --empty builds.
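The breaking vs non-breaking distinction in the versioning note can be illustrated with a small comparison of two column listings. This is a rough sketch under stated assumptions: it treats a removed column or a changed data type as breaking and an added column as non-breaking. The `classify_change` function is hypothetical and is not dbt's implementation; dbt's own detection may cover more cases, such as constraint changes.

```python
def classify_change(old_columns, new_columns):
    """Compare two {column_name: data_type} mappings for a contracted model.

    Returns a (verdict, reasons) tuple where verdict is
    "breaking", "non-breaking", or "unchanged".
    """
    reasons = []
    for name, old_type in old_columns.items():
        if name not in new_columns:
            reasons.append(f"removed column {name}")
        elif new_columns[name] != old_type:
            reasons.append(
                f"changed type of {name}: {old_type} -> {new_columns[name]}"
            )
    if reasons:
        # Consumers relying on the old shape would break.
        return "breaking", reasons
    added = [name for name in new_columns if name not in old_columns]
    if added:
        # Existing consumers keep working; they simply ignore the new columns.
        return "non-breaking", [f"added column {name}" for name in added]
    return "unchanged", []
```

A check like this could run in CI to decide whether a change needs a new model version, though the exact policy is a team decision.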
Upstream Enforcement
-
Pipeline Enforcement Layer Strategy — The four-layer model for enforcement across the full pipeline: pre-warehouse, post-load, transformation, and continuous observability. Where each tool fits and practical adoption ordering.
-
EL Tool Schema Contract Modes — How dlt, Fivetran, and Airbyte handle schema changes during extraction and loading. dlt’s granular freeze/evolve/discard modes vs. managed tools’ blunt settings.
-
Schema Registry for Contract Enforcement — How schema registries enforce contracts on event streams before data reaches the warehouse. Compatibility modes, CEL validation rules, and production practices.
-
Soda Data Contract Verification — Post-load, pre-transformation contract verification with Soda’s YAML-based contract engine. Filling the gap between EL and dbt.
-
dbt Source Schema Validation — Using dbt-expectations on sources to catch column drift and content changes where model contracts can’t reach.
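The freeze/evolve/discard modes mentioned in the EL tool note describe three reactions to a row that carries columns the destination schema does not know. The sketch below models that behaviour in plain Python so the difference is concrete. It does not use dlt's API; `apply_schema_mode`, `SchemaFrozenError`, and the single `"discard"` mode (which here drops only the unexpected values, not the whole row) are assumptions made for illustration.

```python
class SchemaFrozenError(Exception):
    """Raised in 'freeze' mode when a row carries columns outside the schema."""


def apply_schema_mode(row, known_columns, mode):
    """Return (row_to_load, updated_known_columns) for one incoming row."""
    unexpected = set(row) - set(known_columns)
    if not unexpected:
        return dict(row), set(known_columns)
    if mode == "evolve":
        # Accept the new columns and widen the schema.
        return dict(row), set(known_columns) | unexpected
    if mode == "freeze":
        # Refuse the load so the producer-side change is noticed immediately.
        raise SchemaFrozenError(f"unexpected columns: {sorted(unexpected)}")
    if mode == "discard":
        # Load the row but drop values the schema does not know about.
        kept = {k: v for k, v in row.items() if k in known_columns}
        return kept, set(known_columns)
    raise ValueError(f"unknown mode: {mode}")
```

The trade-off is visible in the return values: evolve never loses data but lets the schema drift, freeze protects the schema at the cost of a failed load, and discard keeps the pipeline running while silently dropping the new fields.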
Connected Notes
These existing garden notes intersect with data contracts:
-
dbt Testing Taxonomy — Covers dbt’s native model contracts alongside generic tests, unit tests, and package-based tests. Model contracts are one enforcement point within the broader contract ecosystem.
-
Elementary for dbt — Anomaly detection sits in the third layer of the validation model, catching what contracts and tests both miss.
-
dbt Packages vs Mesh — dbt Mesh uses contracts as the mechanism for cross-project data product sharing. Published models with enforced schemas are, in effect, contracts between teams.
-
Metrics as Code — Metrics defined in version-controlled YAML follow the same “agreements as code” pattern that contracts apply to data structure and quality.
Source Articles
-
Data contracts: a primer for analytics engineers — The pillar article covering fundamentals, history, specifications, tooling, and adoption status.
-
Implementing dbt contract enforcement — Practical guide to dbt’s native contract enforcement, from basic setup through constraint types, schema versioning, and phased rollout.
-
Extending contract enforcement beyond dbt to upstream systems — dbt contracts protect the transformation layer, but bad data enters before dbt runs. Build upstream enforcement with dlt, schema registries, and Soda.
-
The human side of data contracts — Why most contract initiatives stall despite solid tooling. Misplaced enforcement, adoption friction, consumer-defined contracts, and organizational change management.