
AI Agent Regulatory Exposure for Data Teams

Why running AI agents against client data creates contractual and regulatory exposure for data teams — GDPR, data processing agreements, the open-source liability argument, and what the Dutch DPA warning actually means.

Planted
ai · data engineering · analytics

Running AI agents against production systems creates two distinct risk domains for data teams: technical security risks and legal/regulatory exposure. The Dutch Data Protection Authority’s February 2026 warning about OpenClaw was a formal statement from an EU member state DPA, not a security researcher’s opinion — a distinction that affects how data practitioners should weigh it.

What the Dutch DPA Actually Said

On February 13, 2026, the Autoriteit Persoonsgegevens (Dutch DPA) issued a formal statement calling OpenClaw a “Trojan Horse” and urging organizations not to install or use it on systems containing:

  • Access codes and passwords
  • Financial records
  • Employee data
  • Private documents
  • Identity documents

That list is not abstract. For analytics engineers and data consultants, it describes the precise categories of data that most client engagements involve. Financial records are the core of most analytics work. Employee data appears in HR pipelines, workforce analytics, and expense reporting. Access codes and passwords are the credentials you use to connect to client warehouses. If your work regularly involves any of these categories — and it likely does — the Dutch DPA was warning about your setup specifically.

The UK’s Information Commissioner’s Office followed on February 25, 2026, with Commissioner John Edwards characterizing agentic AI as “a future concern.” The posture across EU and UK data protection authorities is converging on the same view: agentic AI with broad data access is a live compliance question, and the Dutch DPA is already treating it as an active risk rather than a theoretical one.

Data Processing Agreements and Third-Party AI

Most analytics consulting engagements are governed by a data processing agreement (DPA) — a legal document that specifies how the contractor is permitted to process the client’s personal data. DPAs typically enumerate the processing purposes, the technical and organizational measures in place, which subprocessors (third parties) are authorized to receive client data, and the conditions under which data may be transferred.

Running client data through an LLM API creates a subprocessing relationship with the LLM provider. When an OpenClaw agent queries a client warehouse and sends the results to Claude, GPT-4o, or Gemini for processing, the LLM provider receives that data. If the client’s DPA does not explicitly authorize AI processing or name the LLM provider as a permitted subprocessor, that transfer may violate the agreement — regardless of the LLM provider’s own data handling policies.

This is not hypothetical. The data flow looks like this:

Client warehouse → OpenClaw agent → LLM API (Anthropic/OpenAI/Google) → response

The middle step — data from the client warehouse going to the LLM API — is the subprocessing relationship that most DPAs don’t address. Agreements written before the current wave of AI tools will almost certainly not cover this. Agreements written recently may cover it, but only if both parties explicitly negotiated AI processing terms.
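
To make that middle step concrete, here is a minimal sketch of one agent iteration, assuming a hypothetical warehouse connection and the Anthropic Messages API as the LLM endpoint. The table name and agent wiring are illustrative, not OpenClaw’s actual internals; the structural point is that query results end up inside the body of an HTTPS request to a third party.

```python
# Minimal sketch of the data flow above. The warehouse query and agent
# wiring are hypothetical; what matters is the middle step, where client
# data leaves your infrastructure inside the request body.
import os
import requests

def run_agent_step(question: str, warehouse_conn) -> str:
    # Step 1: the agent queries the client warehouse (table name illustrative).
    rows = warehouse_conn.execute(
        "SELECT * FROM finance.invoices LIMIT 100"
    ).fetchall()

    # Step 2: the rows are serialized into the prompt. From this point on,
    # the client's data travels with the request: this is the subprocessing
    # relationship most DPAs never anticipated.
    prompt = f"{question}\n\nData:\n{rows}"

    # Step 3: the prompt, client data included, goes to a third party.
    resp = requests.post(
        "https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
        },
        json={
            "model": "claude-sonnet-4-20250514",  # model id illustrative
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["content"][0]["text"]
```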

The practical question before running any AI agent against client data is: does the governing DPA authorize this data transfer? If you don’t know the answer, the answer is probably no.

The GDPR Compliance Question

GDPR applies to the processing of personal data of EU individuals, regardless of where the processing organization is located. If your client’s data includes personal data of EU residents — which is likely for any organization operating in Europe — GDPR governs how that data is processed.

GDPR requires a lawful basis for each processing activity, and running client data through an LLM API is a processing activity. Of the available lawful bases (legitimate interest, contractual necessity, consent, and so on), the plausible candidates for most analytics use cases are contractual necessity or legitimate interest. Contractual necessity requires that the processing is genuinely necessary to perform the contract with the data subjects (the client’s customers), not merely convenient; legitimate interest requires a documented balancing of the organization’s interest against the data subjects’ rights. Neither is satisfied by default.

“I wanted to use an AI tool to make my work easier” is not a GDPR lawful basis. Neither is “the AI tool is open source.”

The Dutch DPA’s statement included an explicit point on this: organizations and users remain responsible for GDPR compliance regardless of whether the AI system is open-source. Open source means you can inspect and audit the code. It does not mean the processing activities conducted with the tool are compliant. It does not transfer liability from the data processor (you) to the software project.

The “Open Source Doesn’t Transfer Liability” Argument

A common response to security and compliance concerns about OpenClaw is that it’s open source, so you can audit it, and you’re not trusting a vendor’s black box. This conflates code transparency with legal responsibility.

The DPA’s point is worth restating precisely: you are responsible for what you do with a tool, not the tool’s author. If you process client personal data through OpenClaw and that processing violates your DPA or GDPR, the exposure is yours — not OpenClaw’s. Open source provides transparency and auditability, both of which are genuinely valuable. It does not create a legal shield for the user.

This argument is sometimes extended further: “the LLM provider processes the data, so any compliance issues are theirs.” This also fails. In a subprocessing chain, the data controller (your client) has a relationship with the data processor (you). You are responsible for ensuring that your subprocessors (LLM providers) meet the required compliance standards and are appropriately authorized in the DPA. The liability chain runs through you, not around you.

Industry-Specific Regulations

Beyond GDPR, several industry-specific regulations impose additional requirements that bear on AI agent use:

HIPAA (US health data): Processing protected health information through a third-party LLM API would require a Business Associate Agreement (BAA) with the LLM provider. Most standard LLM API agreements are not BAAs. Running health data through the standard OpenAI or Anthropic API without a BAA would be a HIPAA violation.

SOX (US public company financial data): Sarbanes-Oxley imposes requirements on the integrity and auditability of financial reporting processes. Running financial data through AI systems that may modify or summarize it without a clear audit trail creates SOX compliance risk.

PCI-DSS (payment card data): Payment card data should never enter an LLM context. PCI-DSS explicitly prohibits unnecessary retention and transmission of cardholder data. If an agent queries tables that include payment data, that data should be masked or excluded from what’s transmitted to the LLM.
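
As a sketch of what that masking might look like in practice: the regex and column names below are illustrative, and a regex pass supplements column-level exclusion rather than replacing it. The point is where the filtering has to sit, which is before the API call, not after.

```python
# Illustrative sketch: strip cardholder data from rows before they are
# serialized into an LLM prompt. A regex mask is not a complete PCI-DSS
# control; exclusion at the query or column level is stronger.
import re

# Matches 13 to 19 consecutive digits, optionally separated by spaces or
# hyphens (common PAN formats). Column names are hypothetical examples.
PAN_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")
BLOCKED_COLUMNS = {"card_number", "pan", "cvv", "card_expiry"}

def mask_row(row: dict) -> dict:
    """Drop blocked columns outright; mask PAN-shaped values elsewhere."""
    masked = {}
    for col, value in row.items():
        if col.lower() in BLOCKED_COLUMNS:
            continue  # excluded entirely from what the LLM sees
        if isinstance(value, str):
            value = PAN_PATTERN.sub("[REDACTED-PAN]", value)
        masked[col] = value
    return masked
```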

For analytics teams, the practical step is to know which regulations govern each client engagement and trace through how AI agent data flows interact with each one before connecting any agent to that client’s systems.

What Compliance-Conscious Use Looks Like

The compliance concerns above don’t mean AI agents are incompatible with regulated data work. They mean that regulatory clarity needs to precede deployment, not follow it.

The sequence that creates legal exposure is: deploy agent → get value → hope compliance catches up. The sequence that creates defensible use is: review DPA and applicable regulations → identify what processing activities the agreement covers → scope agent access to only what’s covered → update DPA if broader use is needed → deploy agent.

In practice, this means:

Audit your DPAs before connecting agents to client systems. Look specifically for clauses covering AI processing, LLM API usage, and subprocessor authorization. If the agreement predates 2024, it almost certainly doesn’t cover this.

Use local models for anything that needs to stay local. Ollama running Llama or DeepSeek locally means no data leaves your machine via API. You lose some capability compared to frontier models, but you eliminate the third-party subprocessing relationship entirely. For data that cannot go to third parties, local inference is the appropriate architecture.
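
As a sketch, assuming a default local Ollama install: the same agent step runs against localhost, so the prompt, client data included, never leaves the machine. The model name is illustrative; any locally pulled model works the same way.

```python
# Same agent step, but inference happens on localhost via Ollama's HTTP
# API, so no client data leaves the machine.
import requests

def run_local_agent_step(question: str, rows: list) -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default endpoint
        json={
            "model": "llama3.1",   # illustrative; any pulled model works
            "prompt": f"{question}\n\nData:\n{rows}",
            "stream": False,       # return one JSON object, not a stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```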

Separate agent access by data sensitivity. The agent that generates summaries from public metrics does not need access to the same schema as the agent that monitors PII tables. Scope access at the service account level — each client gets a dedicated read-only service account with access only to the schemas the agent actually needs.
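
A sketch of that provisioning on Postgres follows. The role and schema names are hypothetical, and warehouses like Snowflake or BigQuery have equivalent grant mechanisms.

```python
# Sketch of per-client, read-only scoping on Postgres. Role, schema, and
# connection details are hypothetical. Identifiers are interpolated with
# f-strings, so this assumes trusted, non-user-supplied names.
import psycopg2

def provision_agent_role(dsn: str, role: str, schema: str, password: str) -> None:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # A dedicated login role for one agent on one client engagement.
        cur.execute(f"CREATE ROLE {role} LOGIN PASSWORD %s", (password,))
        # Visibility into exactly one schema and nothing else.
        cur.execute(f"GRANT USAGE ON SCHEMA {schema} TO {role}")
        cur.execute(f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role}")
        # Tables created later in that schema stay readable, still read-only.
        cur.execute(
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} "
            f"GRANT SELECT ON TABLES TO {role}"
        )

# Example: the public-metrics summarizer gets public_metrics and nothing
# touching PII.
# provision_agent_role(dsn, "agent_clientx_metrics", "public_metrics", pw)
```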

Document your data flows. GDPR’s accountability principle requires that you can demonstrate compliance, not just claim it. A data flow diagram that shows what the agent accesses, what goes to which LLM API, and on what legal basis, is the beginning of that documentation.
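
A minimal, machine-readable starting point might look like the following. The field names and values are illustrative; the format matters far less than the record existing, and being accurate, before the agent is connected.

```python
# Illustrative data-flow record for one agent deployment. Field names and
# values are hypothetical; the GDPR accountability point is that this is
# written down and kept current.
from dataclasses import dataclass

@dataclass
class AgentDataFlow:
    agent_name: str
    client: str
    sources_accessed: list[str]          # schemas/tables the agent can read
    personal_data_categories: list[str]  # e.g. "employee data", or "none"
    llm_provider: str                    # the subprocessor receiving data
    dpa_clause: str                      # where the DPA authorizes this
    lawful_basis: str                    # GDPR Art. 6 basis relied on

flow = AgentDataFlow(
    agent_name="weekly-metrics-summarizer",
    client="ClientX",
    sources_accessed=["public_metrics.daily_kpis"],
    personal_data_categories=["none"],
    llm_provider="local (Ollama, no third-party transfer)",
    dpa_clause="Schedule 2, s.4 (automated processing)",
    lawful_basis="contractual necessity (Art. 6(1)(b))",
)
```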

The regulatory landscape for agentic AI is still forming. The Dutch DPA’s warning and the UK ICO’s comment both signal that formal guidance is coming. The direction of regulatory travel is clear; making conservative choices about data access and LLM API usage ahead of formal rules is lower-cost than remediation after the fact.

See Security Posture for AI Agents for the technical access controls that support compliant deployments, and OpenClaw Security Risks — What’s Documented for the specific documented incidents that inform the regulatory response.