
Business Cost of Poor Data Quality

The measurable financial and operational impact of data quality failures — industry statistics, high-profile incidents, and why prevention costs a fraction of remediation.


The costs of poor data quality are diffuse: wrong decisions, debugging time, and gradually eroding stakeholder trust. Industry numbers make the case for investment in concrete terms.

The Industry Numbers

$12.9 million per year per organization. That’s Gartner’s estimate of the average annual cost of poor data quality. The figure includes direct costs (rework, remediation, fines) and indirect costs (delayed decisions, missed opportunities, eroded stakeholder confidence). For a mid-market company, even a fraction of this number dwarfs the cost of a data quality tooling investment.

57% of analytics engineers’ time goes to maintenance. The 2025 dbt Labs State of Analytics Engineering survey found that practitioners spend 57% of their working hours maintaining and organizing existing datasets rather than building new capabilities. That’s more than half of your most expensive technical talent doing janitorial work because the foundation isn’t trustworthy.

56% cite poor data quality as their top challenge. The same survey found that the share of analytics engineers naming data quality as their most significant challenge rose from 41% in 2022 to 56% in 2025. The problem is getting worse, not better, as data volumes grow and pipeline complexity increases.

40% of data engineering time on quality issues. A Monte Carlo survey found that data engineers spend 40% of their workdays dealing with data quality problems. For a team of five data engineers at $150K average salary, that’s $300K/year in salary alone spent on reactive firefighting.

Data downtime nearly doubled year-over-year. A 2023 Monte Carlo survey found that data downtime — periods when data is partial, erroneous, missing, or otherwise unusable — nearly doubled compared to the prior year, with time-to-resolution increasing by 166%. The problem compounds: more data, more pipelines, more failure modes, and slower recovery when failures occur.

High-Profile Failures

Three incidents illustrate the range of consequences.

Unity Technologies: $110 Million Revenue Loss

Unity Technologies lost $110 million in Q1 2022 revenue after ingesting faulty data from a large customer, which corrupted the machine learning models driving its advertising targeting. The effects cascaded: targeting accuracy degraded, advertisers got poor results, revenue dropped, and the stock price fell more than 30%.

The root cause wasn’t a sophisticated attack or a rare edge case. It was bad data flowing through a pipeline that lacked adequate validation. The kind of problem that volume anomaly detection or distribution checks would have caught early.
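A check of this kind needn't be elaborate. As an illustrative sketch (the numbers and threshold are invented, not Unity's actual pipeline), a rolling z-score over daily row counts flags sudden volume shifts before they reach a model:

```python
from statistics import mean, stdev

def volume_anomaly(daily_counts, window=14, threshold=3.0):
    """Flag the latest daily row count if it deviates more than
    `threshold` standard deviations from the trailing window."""
    history, latest = daily_counts[-window - 1:-1], daily_counts[-1]
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Steady ingestion of ~1M rows/day, then a sudden collapse.
counts = [1_000_000 + d for d in (0, 1200, -800, 500, -300, 900, -600,
                                  400, -1000, 700, -200, 800, -500, 300)]
assert not volume_anomaly(counts + [1_000_400])   # normal day passes
assert volume_anomaly(counts + [120_000])         # dropped rows are flagged
```

Production tools use more robust statistics (seasonality, trend adjustment), but the core idea is this small.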

JPMorgan Chase: $350 Million in Regulatory Fines

JPMorgan Chase was fined approximately $350 million in 2024 for incomplete trading data in their surveillance systems. The bank failed to report millions of trades accurately over several years. Regulators found gaps in data collection, processing, and reporting that violated multiple regulatory requirements.

In regulated industries, data quality isn’t optional — it’s a compliance requirement with directly measurable penalties. The $350 million fine exceeds the total cost of any data quality infrastructure the bank could have built.

Public Health England: 16,000 Lost COVID Test Results

In October 2020, Public Health England lost 16,000 positive COVID-19 test results when Excel's row limit truncated records. The files exceeded the 65,536-row maximum of the legacy .xls format being used, and the excess rows were silently dropped.

No error was raised. No test failed. The data simply disappeared. Contact tracing for those 16,000 positive cases was delayed by days, during a period when rapid contact tracing was critical to pandemic response.

This is the archetypal case for volume anomaly detection: the pipeline appeared to work, every row that existed was valid, but the total volume was drastically wrong. A simple row count comparison between the source CSV and the loaded dataset would have caught the problem immediately.
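The check in question is trivial to build. A sketch (file, table, and column names are invented) that reconciles source rows against loaded rows and refuses to pass silently:

```python
import csv, os, sqlite3, tempfile

def reconcile(csv_path, conn, table):
    """Raise if the loaded table has fewer rows than the source CSV."""
    with open(csv_path, newline="") as f:
        source_rows = sum(1 for _ in csv.reader(f)) - 1  # minus header
    loaded_rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if loaded_rows < source_rows:
        raise ValueError(
            f"row count mismatch: {source_rows} in source, {loaded_rows} loaded")
    return loaded_rows

# Simulate a load that silently truncates rows, as the .xls limit did.
path = os.path.join(tempfile.mkdtemp(), "results.csv")
with open(path, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["case_id", "result"])
    for i in range(100):
        w.writerow([i, "positive"])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (case_id INTEGER, result TEXT)")
conn.executemany("INSERT INTO results VALUES (?, ?)",
                 [(i, "positive") for i in range(60)])  # 40 rows silently lost
try:
    reconcile(path, conn, "results")
except ValueError as e:
    print(e)  # row count mismatch: 100 in source, 60 loaded
```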

The Prevention vs. Remediation Asymmetry

Prevention is cheaper than remediation. A test that catches a schema change in CI takes minutes to fix. The same schema change discovered three days later — after dashboards have been reporting wrong numbers and stakeholders have made decisions based on those numbers — takes hours to diagnose, hours to fix, and significant credibility to rebuild.
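One cheap form of that CI test is a schema contract check. A sketch (the expected schema is illustrative, not from any particular warehouse):

```python
EXPECTED_SCHEMA = {          # contract for a hypothetical orders table
    "order_id": "INTEGER",
    "customer_id": "INTEGER",
    "amount": "REAL",
}

def check_schema(actual, expected=EXPECTED_SCHEMA):
    """Return a list of human-readable schema violations."""
    problems = []
    for col, dtype in expected.items():
        if col not in actual:
            problems.append(f"missing column: {col}")
        elif actual[col] != dtype:
            problems.append(f"{col}: expected {dtype}, got {actual[col]}")
    for col in actual.keys() - expected.keys():
        problems.append(f"unexpected column: {col}")
    return problems

# An upstream rename surfaces in CI, not three days later in a dashboard.
live = {"order_id": "INTEGER", "customer_id": "INTEGER", "amount_usd": "REAL"}
print(check_schema(live))
# ['missing column: amount', 'unexpected column: amount_usd']
```

Failing the build on a non-empty result turns a multi-day incident into a minutes-long fix.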

The minimum viable observability stack — primary key tests, freshness monitoring, volume anomaly detection, and alerting — costs nothing in licensing. Elementary OSS is free. dbt’s native tests are free. The investment is configuration time: a few days of setup that prevents the majority of common data quality incidents.
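For dbt users, the free tier of that stack is a few lines of YAML. A minimal sketch (model and column names are illustrative) using dbt's built-in tests plus the volume anomaly test from Elementary's open-source dbt package:

```yaml
# models/schema.yml — hypothetical model; all tests shown are free to run
version: 2

models:
  - name: orders
    tests:
      - elementary.volume_anomalies   # from the Elementary OSS dbt package
    columns:
      - name: order_id
        tests:
          - unique                    # primary key tests
          - not_null
      - name: updated_at
        tests:
          - not_null
```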

More sophisticated tooling — Elementary Cloud, Monte Carlo, Anomalo — costs $5K-50K/month depending on scale. Against the backdrop of $12.9 million in annual quality costs, even the most expensive tooling represents a fraction of the problem it addresses.

Making the Business Case

Three arguments for data quality investment:

Time recovery. At 40-57% of engineering time spent on quality issues, a 50% reduction frees 20-28% of total capacity. For a five-person team, that is at least one full-time engineer's worth of capacity.

Incident cost avoidance. Track the time spent on the last three data quality incidents — investigation, fix, stakeholder communication, downstream impact — and multiply by annual frequency. The resulting number typically exceeds tooling costs.

Trust preservation. Each time a stakeholder discovers wrong numbers in a dashboard before the data team does, trust erodes. Stakeholders eventually revert to spreadsheets or gut instinct rather than data. Prevention preserves the trust that makes data work useful.
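The first two arguments reduce to arithmetic. A back-of-envelope sketch using the figures above (team size, incident counts, and hourly cost are illustrative inputs, not benchmarks):

```python
def business_case(team_size, avg_salary, quality_time_pct, reduction_pct,
                  incidents_per_year, hours_per_incident, hourly_cost):
    """Back-of-envelope value of a data quality investment."""
    # Time recovery: halving quality toil frees a slice of total capacity.
    freed_capacity = quality_time_pct * reduction_pct       # share of all hours
    freed_salary = team_size * avg_salary * freed_capacity
    # Incident cost avoidance: hours burned per incident, annualized.
    incident_cost = incidents_per_year * hours_per_incident * hourly_cost
    return freed_capacity, freed_salary, incident_cost

cap, salary, incidents = business_case(
    team_size=5, avg_salary=150_000,
    quality_time_pct=0.40,        # Monte Carlo's 40% figure
    reduction_pct=0.50,           # assume tooling halves the toil
    incidents_per_year=24, hours_per_incident=16, hourly_cost=75)

print(f"{cap:.0%} of capacity freed")      # 20% — one engineer on a team of five
print(f"${salary:,.0f}/yr in salary")      # $150,000/yr
print(f"${incidents:,.0f}/yr in incidents")  # $28,800/yr
```

Comparing those two outputs against a tooling quote is usually the whole business case.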