
The AI Production Gap in Data Engineering

Why AI gets you to 80% fast but the remaining 20% — security, compliance, temporal consistency, governance — is where most of the real work lives.

Tags: dbt · AI · data engineering · data quality

AI gets code to 80% complete quickly; the distance between 80% and production-ready is where most of the work lies. For data pipelines, the remaining 20% may be a temporal filter applied inconsistently across five joined tables, invisible until a stakeholder notices the revenue report does not match. Initial demo success creates the illusion of near-completion, but production readiness requires exactly the contextual judgment the context gap describes.
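A hedged sketch of that failure mode (table and column names hypothetical): the as-of filter is applied to orders but silently omitted from the joined refunds.

```sql
-- The cutoff filter is applied to orders but not to the joined
-- refunds, so pre-cutoff refunds leak into the revenue figures.
select
    o.order_id,
    o.amount  as gross_amount,
    r.amount  as refund_amount
from {{ ref('orders') }} o
left join {{ ref('refunds') }} r
    on r.order_id = o.order_id
    -- missing: and r.refund_date >= '2024-01-01'
where o.order_date >= '2024-01-01'
```

The query compiles and returns rows; the discrepancy only surfaces when someone reconciles the report.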

The Security Gap

Studies have found that 48% of AI-generated code suggestions contain security vulnerabilities, and that 25–30% of GitHub Copilot output contains weaknesses catalogued in the Common Weakness Enumeration (CWE). These numbers come from general software engineering, but data engineering has its own version of the problem:

PII exposure. An AI generating a SELECT statement might pull all columns from a table that contains email addresses, phone numbers, or SSNs. It doesn’t know your organization’s data classification policy. It doesn’t know that customer_contact_detail contains PII that should never leave the restricted dataset.
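A minimal sketch of the safer pattern, assuming hypothetical columns on the customer_contact_detail model (the hashing function varies by warehouse):

```sql
-- What AI tends to generate: every column, PII included.
--   select * from {{ ref('customer_contact_detail') }}

-- What a data classification policy typically requires:
-- explicit columns, raw identifiers excluded or pseudonymized.
select
    customer_id,
    country_code,
    md5(email_address) as email_hash  -- pseudonymized; the raw value never leaves
from {{ ref('customer_contact_detail') }}
```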

Credentials in configuration. AI generating dbt profile configurations, connection strings, or environment setup scripts can hardcode credentials that should be environment variables. This is a well-known antipattern, but AI systems reproduce it because they’ve been trained on code that does it.
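dbt's standard answer is the env_var() function, which resolves secrets from the environment at render time. A sketch of a profiles.yml, connection details hypothetical:

```yaml
# profiles.yml
# The antipattern AI reproduces:  password: "s3cret"
# The fix: read secrets from environment variables.
my_project:
  target: prod
  outputs:
    prod:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      database: analytics
      schema: prod
      threads: 4
```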

SQL injection vectors. Dynamic queries generated by AI — especially those using Jinja macros that interpolate user-provided values — can introduce injection vulnerabilities if the AI doesn’t use proper parameterization or sanitization.
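In dbt terms, the risk is a macro that interpolates a runtime variable straight into compiled SQL. A hedged sketch, with allowlist validation as one possible mitigation:

```sql
-- Unsafe: whatever arrives via --vars lands in the query text.
{% macro filter_by_region_unsafe(column) %}
    {{ column }} = '{{ var("region") }}'
{% endmacro %}

-- Safer: validate against an allowlist before interpolating,
-- and fail the compile loudly on anything unexpected.
{% macro filter_by_region(column) %}
    {% set allowed = ['US', 'EU', 'APAC'] %}
    {% if var('region') not in allowed %}
        {{ exceptions.raise_compiler_error("Unexpected region: " ~ var('region')) }}
    {% endif %}
    {{ column }} = '{{ var("region") }}'
{% endmacro %}
```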

None of these are theoretical. They’re the default failure mode when AI generates code without understanding the security posture of the environment it’s generating for.

The Compliance Dimension

The regulatory pressure on AI-generated code is growing fast and becoming concrete:

  • The EU AI Act becomes fully enforceable on August 2, 2026, with fines up to 7% of global turnover.
  • GDPR fines have reached €5.88 billion cumulative, with €1.2 billion in 2024 alone.
  • Deloitte’s 2025 report found 73% of enterprises cite data privacy and security as their top AI risk.

For data engineering specifically, compliance manifests in decisions that AI can’t make:

An AI that suggests a dependency with a known vulnerability creates a supply chain risk. An AI that generates code violating an internal compliance policy (say, processing EU customer data in a US region) creates a GDPR exposure. An AI that introduces a library under a restrictive license creates legal liability. Each of these requires human judgment — not about whether the code works, but about whether the code is permissible.

This dimension is growing because regulation is catching up to AI adoption. Organizations that treated AI-generated code as equivalent to human-written code for compliance purposes will increasingly find that regulators disagree.

HBR’s Seven Frictions

In a March 2026 analysis, Harvard Business Review identified seven frictions in AI’s “last mile” to production. Each friction is a human problem, not a technical one:

  1. Proliferation of pilots that never graduate. AI produces impressive demos. Moving from demo to production requires handling edge cases, error recovery, monitoring, alerting — the unglamorous work that demos skip.

  2. The gap between demo productivity and real work. AI speeds up the greenfield case dramatically. Modifying existing production code — understanding why it was written that way, what downstream systems depend on it, what would break — is slower with or without AI.

  3. Process debt from accumulated workarounds. Every organization has processes that exist because of a specific historical failure. AI doesn’t know the history. It suggests the “clean” solution that was already tried and failed.

  4. Tribal knowledge that resists codification. This is the context gap in organizational form. The knowledge needed to make production decisions lives in people, not documents.

  5. Governance for agentic systems. When an AI agent makes a change to a production pipeline, who approved it? What’s the audit trail? How do you roll back? Traditional change management doesn’t account for autonomous agents.

  6. Architectural complexity. Real systems are interconnected in ways that resist local optimization. Improving one component can degrade another. AI optimizes locally; production readiness requires thinking globally.

  7. The efficiency trap. Optimizing for speed undermines quality. The fastest way to build a pipeline is to skip tests, skip documentation, skip review. AI defaults to the fast path unless constrained by process.

What Bridges the Gap

The production gap is structural — it exists because production requirements are contextual, organizational, and regulatory, not just technical. Better models do not close it.

Testing as a safety net. The dbt Testing Taxonomy provides five mechanisms for catching issues before they reach production. Generic tests catch structural violations. Unit tests validate logic. dbt-expectations catches business rule violations. Model contracts enforce schema stability. Elementary catches anomalies you didn’t think to test for. Together, they form a defense layer that makes AI-generated code safer to deploy.
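A compressed sketch of how three of those layers stack on one model (model and column names hypothetical; assumes the dbt-expectations package is installed):

```yaml
# models/marts/schema.yml
models:
  - name: fct_revenue
    config:
      contract:
        enforced: true               # model contract: schema stability
    columns:
      - name: order_id
        data_type: varchar
        tests:
          - unique                   # generic tests: structural violations
          - not_null
      - name: revenue_amount
        data_type: numeric
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0           # business rule: revenue is never negative
              max_value: 1000000
```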

Review processes adapted for AI output. Code review of AI-generated code needs different emphasis than review of human-written code. Humans make typos and logical errors. AI makes contextual errors — wrong joins, missing filters, inappropriate assumptions. Reviewers need to focus on “does this make sense for our business?” rather than “is the syntax correct?”

Governance that accounts for AI. Change management, approval workflows, and audit trails need to handle the case where code was generated rather than written. Not because AI-generated code is inherently worse, but because the provenance matters for compliance and accountability.

Incremental trust. Start AI in development environments. Move to staging with human review. Promote to production with monitoring. Each step builds confidence that the AI output meets production standards. The organizations that skip steps — going directly from “AI wrote this” to “it’s in production” — are the ones that will encounter the production gap most painfully.

The production gap is a reason to invest in the human processes — testing, review, governance, monitoring — that bridge the gap between demo and production, not a reason to avoid AI tools.