Imagine this: your phone buzzes at 7:02 AM. Not a PagerDuty alert. A WhatsApp message from an AI agent you set up the night before, telling you that three dbt tests failed overnight and here’s what it thinks went wrong. One is a freshness check on a source table that didn’t load. The other two are related downstream failures. The agent has already identified the root cause and is asking whether you want it to notify the client.
That’s the promise of OpenClaw: an autonomous AI agent that lives on your hardware, connects to your messaging apps, and does things without waiting for you to ask. But what’s the reality for data people, and how does it fit alongside Claude Code and Cursor?
What OpenClaw actually is (and isn’t)
Skip past the chatbot comparisons, the copilot framing, and the coding tool category. OpenClaw is an autonomous agent that runs 24/7 on your own hardware, connecting large language models to the messaging apps and CLI tools you already use.
The core of OpenClaw is the Gateway, a single Node.js daemon that runs as a background service on your machine. Think of it as a control plane that sits between your LLMs and 15+ messaging channels: WhatsApp, Slack, Telegram, Discord, Signal, Microsoft Teams, and more. You message it like a colleague. It messages you back.
Three things set it apart from the AI tools most data people are already using.
It’s model-agnostic. Bring your own keys for Claude, GPT, DeepSeek, Gemini, or run Llama locally through Ollama. You’re not locked into any single provider. Switch models based on task, cost, or preference.
It’s proactive. This is the big one. OpenClaw doesn’t wait for you to type a prompt. A configurable heartbeat system reads a HEARTBEAT.md checklist every 30 minutes and autonomously decides whether to act. Pair that with a built-in cron scheduler and you have an agent that can trigger dbt runs, check pipeline status, and send you a summary before your first coffee.
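The exact contents of HEARTBEAT.md are up to you; it’s free-form Markdown that the agent interprets on each tick. A hypothetical checklist for pipeline monitoring might look something like this (illustrative, not a documented schema):

```markdown
# HEARTBEAT.md — read on every heartbeat tick
- [ ] If it's between 06:30 and 08:00, check whether last night's dbt run finished.
- [ ] If any dbt test failed, summarize the failures and message me on Slack.
- [ ] Otherwise, do nothing and stay quiet.
```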
Everything is plain text. Configuration, conversation history, and memory are stored locally as Markdown and YAML files. You can inspect them, grep them, and back them up with Git. For data people who think in version-controlled config files, this feels natural.
OpenClaw is MIT-licensed, completely free, and self-hosted. Your data stays on your machine. The only ongoing cost is the API usage for whichever LLM you connect.
How it got here so fast
OpenClaw was created by Peter Steinberger, an Austrian developer best known for founding PSPDFKit (now Nutrient SDK), a document SDK used by Apple and Dropbox across over a billion devices. After selling the company and taking a break from tech, he came back and started experimenting with AI.
He built the first prototype in about an hour by connecting WhatsApp to Claude. It was his 43rd side project.
This one took off. The project hit 100,000 GitHub stars in under two weeks, making it the fastest-growing open-source project in GitHub history. As of late February 2026, it sits at around 196,000 stars with over 33,000 forks.
The growth was partly fueled by Moltbook, an AI-only social network that launched around the same time and created a viral loop. At its peak, the repo was gaining 710 stars per hour. The first in-person meetup, ClawCon, drew over 1,000 registrants to San Francisco’s Frontier Tower.
If you’re wondering about the name: it started as “Clawdbot” (a pun on Claude), got renamed to “Moltbot” after Anthropic filed a trademark complaint, then landed on “OpenClaw” (lobsters molt; “claw” preserves the lobster heritage; “open” for open source). The mascot is a lobster, and the community leans into it.
In February 2026, Steinberger announced he was joining OpenAI to “drive the next generation of personal agents.” Sam Altman reportedly called him “a genius with a lot of amazing ideas.” The project moved to an independent open-source foundation, with OpenAI as a backer.
The ecosystem backs it up: thousands of community-built skills on ClawHub, 20 repositories under the OpenClaw organization, and a Product Hunt category of its own. This isn’t a weekend experiment that will disappear in three months.
But the security researchers are paying very close attention, too. More on that shortly.
Why data people should care
For analytics engineers, the interesting part is straightforward: cron jobs + shell access + messaging = pipeline monitoring you don’t have to build from scratch.
OpenClaw can run any CLI command. That means dbt test, dbt run, bq query, snowsql. The built-in cron scheduler supports standard cron expressions with timezone support, and jobs persist across restarts. You can set up a cron job that runs dbt test at 7 AM every day, parses the output, and sends a formatted summary to a specific Slack channel. The command looks like this:
```shell
openclaw cron add \
  --name "dbt test monitor" \
  --cron "0 7 * * *" \
  --tz "America/Los_Angeles" \
  --session isolated \
  --message "Run dbt test and summarize any failures." \
  --announce \
  --channel slack \
  --to "channel:C1234567890"
```

Two execution modes are available: “main session” (runs in the agent’s primary conversation context) or “isolated” (spins up a dedicated session). For data pipeline monitoring, isolated mode keeps things clean.
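The “parses the output” step is where you’d add your own logic. A minimal sketch of a summarizer, assuming dbt’s default human-readable console log where failing tests emit lines containing FAIL or ERROR (the helper name and sample output are illustrative):

```python
import re

def summarize_dbt_output(output: str) -> str:
    """Condense raw `dbt test` console output into a short chat message.

    Assumes dbt's default console format, where failing tests emit
    lines containing "FAIL" or "ERROR" next to the test name.
    """
    failures = [
        line.strip()
        for line in output.splitlines()
        if re.search(r"\b(FAIL|ERROR)\b", line)
    ]
    if not failures:
        return "dbt test: all tests passed."
    header = f"dbt test: {len(failures)} failure(s):"
    return "\n".join([header, *failures])

sample = """\
12:01:03  1 of 3 PASS not_null_orders_order_id ............ [PASS in 0.4s]
12:01:04  2 of 3 FAIL 3 unique_orders_order_id ............ [FAIL 3 in 0.6s]
12:01:05  3 of 3 ERROR stale_source_freshness ............. [ERROR in 0.1s]
"""
print(summarize_dbt_output(sample))
```

The agent would run this (or an equivalent inline prompt instruction) after the scheduled `dbt test` and post the returned string to your channel.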
Browser automation is another angle. OpenClaw has full Chrome DevTools Protocol control: navigate pages, fill forms, take screenshots, handle cookies. If you’re an agency managing client dashboards that don’t have APIs, you can schedule the agent to scrape data tables, compare them against yesterday’s snapshot, and alert on changes. Fragile? Yes. But sometimes you work with what you have.
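The scraping itself would go through the agent’s browser control; the “compare against yesterday’s snapshot” half is ordinary diffing. A hedged sketch, assuming each snapshot is a list of row dicts keyed on one column (all names here are hypothetical):

```python
def diff_snapshots(yesterday: list[dict], today: list[dict], key: str) -> dict:
    """Compare two scraped table snapshots keyed on one column.

    Returns the key values that were added, removed, or changed
    between the two snapshots.
    """
    old = {row[key]: row for row in yesterday}
    new = {row[key]: row for row in today}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }

yesterday = [{"campaign": "brand", "spend": 120}, {"campaign": "retargeting", "spend": 80}]
today = [{"campaign": "brand", "spend": 140}, {"campaign": "prospecting", "spend": 60}]
print(diff_snapshots(yesterday, today, key="campaign"))
```

Anything in `added`, `removed`, or `changed` becomes the body of the alert message.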
Persistent memory is where OpenClaw differs most from session-based tools. Claude Code resets between sessions (using CLAUDE.md for continuity). OpenClaw remembers context across days and weeks via Markdown-based memory files. If a test fails on Monday and again on Thursday, the agent can tell you it’s a recurring pattern without you having to explain the history.
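Because the memory is plain Markdown, recurrence detection is just text scanning. A sketch under an assumed (hypothetical) note format where the agent logs dated failure lines:

```python
import re
from collections import Counter

def recurring_failures(memory_text: str, threshold: int = 2) -> list[str]:
    """Scan a Markdown memory file for dated test-failure notes and
    report tests that failed on `threshold` or more days.

    Assumes a hypothetical note format like:
      - 2026-02-23: FAIL unique_orders_order_id
    """
    hits = re.findall(r"\d{4}-\d{2}-\d{2}: FAIL (\S+)", memory_text)
    return [name for name, n in Counter(hits).items() if n >= threshold]

memory = """\
- 2026-02-23: FAIL unique_orders_order_id
- 2026-02-24: all tests passed
- 2026-02-26: FAIL unique_orders_order_id
"""
print(recurring_failures(memory))  # → ['unique_orders_order_id']
```

The same grep-ability that makes the files easy to audit makes them easy to mine for patterns like this.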
ClawData deserves a mention. It’s a community project by Sean Preusse that builds data-engineering-specific skills for OpenClaw: dbt, DuckDB, Snowflake, Airflow. It ships with a medallion-architecture dbt project and a web dashboard called Mission Control. Fair warning: it’s extremely early (4 commits, 2 GitHub stars as of late February 2026). No independent reviews exist. I’ll cover it in detail in the next article in this series, but for now, file it under “worth watching, not yet proven.”
The framing that sticks with people using OpenClaw daily is the AI coworker angle. You message it from your phone like you’d message a teammate. “Hey, can you check if yesterday’s GA4 export loaded?” is a WhatsApp message, not a terminal command. For consultants juggling multiple clients from their phones, it can be quite helpful.
How it compares to the tools you already use
If you’re already using Claude Code or Cursor for data work, OpenClaw isn’t trying to replace them. They solve different problems.
| | OpenClaw | Claude Code | Cursor |
|---|---|---|---|
| Always on? | Yes (daemon) | No (session) | No (IDE) |
| Interface | Messaging apps | Terminal | IDE |
| Best for | Automation, monitoring, non-coding tasks | Deep coding, refactoring, dbt development | Real-time code assistance, exploration |
| Memory | Persistent (weeks) | Resets (uses CLAUDE.md or skills) | Session-based (can use rules) |
| Model lock-in | None (BYOK) | Anthropic only | Multiple providers |
| Security posture | Significant concerns | Enterprise-grade | Standard IDE |
Claude Code is better at complex code refactoring, multi-file dbt development, and iterative debugging. It has mature Skills and MCP integrations designed specifically for analytics work. If you’re building new models end-to-end or restructuring a dbt project, Claude Code is the right tool.
Cursor gives you real-time code completions and inline edits while you write SQL. With the dbt Power User extension, it has project-aware assistance with access to your dbt graph, model SQL, and lineage. Best for exploratory coding and quick edits.
OpenClaw fills the space that neither of those touches: the always-on layer that works when you’re not at your desk, handling scheduled tasks, background monitoring, and non-coding automation like email triage and calendar management.
The practitioners getting the most value run all three. OpenClaw can actually trigger and manage Claude Code sessions, making it a layer above rather than a direct competitor.
The elephant in the room: security
I can’t write honestly about OpenClaw without spending significant time on security. The concerns are not hypothetical, and for data teams handling client data, they’re disqualifying in certain contexts.
CrowdStrike released enterprise detection and removal tools: a full detection, monitoring, and removal capability across its Falcon platform, including SIEM rules monitoring DNS requests to openclaw.ai, endpoint inventory scanning for OpenClaw installations, an automated removal content pack, and SOAR-based response workflows. The company found over 135,000 publicly exposed OpenClaw instances, many running over unencrypted HTTP.
The Dutch Data Protection Authority called it a “Trojan Horse.” In an official statement from February 12, 2026, the Autoriteit Persoonsgegevens urged users and organizations not to use OpenClaw on systems containing privacy-sensitive or confidential data: access codes, financial records, employee data, private documents. They estimated that roughly 20% of community plugins contain malware.
A Meta AI researcher’s inbox was bulk-deleted by a runaway agent. Summer Yue asked her OpenClaw agent to help manage her email. The agent started deleting messages in what she described as a “speed run,” ignoring her stop commands sent from her phone. She had to physically run to her Mac Mini to stop it. The root cause appears to be context window compaction: when the conversation grew too large, the agent summarized it and lost her “stop” instruction in the process. TechCrunch reported this story but noted they couldn’t independently verify the full extent of the deletion.
An initial security audit found 512 vulnerabilities, 8 classified as critical. A one-click remote code execution flaw (CVE-2026-25253) allowed attackers to hijack OpenClaw instances through a malicious link. A separate WebSocket vulnerability discovered by Oasis Security allowed any website to silently take full control of a developer’s agent with no user interaction required.
Infostealers are already targeting OpenClaw config files. API keys, OAuth tokens, and credentials are stored in plaintext Markdown and JSON files within ~/.openclaw/. Malware families like RedLine, Lumma, and Vidar have added these file paths to their target lists. Hudson Rock documented the first in-the-wild exfiltration of a complete OpenClaw configuration.
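You can’t easily change where those files live, but you can at least verify they aren’t readable by other local users. A minimal permissions audit (a sketch, demonstrated on a throwaway directory rather than a real `~/.openclaw/`; the filename is hypothetical):

```python
import os
import stat
import tempfile
from pathlib import Path

def insecure_files(root: Path) -> list[Path]:
    """Return files under `root` readable or writable by group/others.

    For a directory holding plaintext API keys and tokens,
    anything not 0600 is worth tightening.
    """
    flagged = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            mode = stat.S_IMODE(path.stat().st_mode)
            if mode & 0o077:  # any group/other permission bits set
                flagged.append(path)
    return flagged

# Demo on a throwaway directory instead of a real ~/.openclaw/
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "openclaw.json"
    cfg.write_text('{"api_key": "sk-..."}')
    os.chmod(cfg, 0o644)  # group/world-readable: gets flagged
    print(insecure_files(Path(tmp)))
    os.chmod(cfg, 0o600)  # owner-only: clean
    print(insecure_files(Path(tmp)))
```

This doesn’t stop an infostealer running as your own user, but it closes the cheapest exposure path on shared machines.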
For analytics teams, the implications are specific. Snowflake credentials, BigQuery service account keys, dbt Cloud tokens: all stored in files that attackers are actively looking for. Any untrusted input the agent processes (emails, Slack messages from external parties, web content) could be weaponized through prompt injection to exfiltrate data. Simon Willison calls OpenClaw’s combination of private data access, untrusted content exposure, and external communication ability the “lethal trifecta” for prompt injection.
Palo Alto Networks mapped OpenClaw to every category in the OWASP Top 10 for Agentic Applications. Microsoft’s security team recommended using it only in isolated environments, not on standard personal or enterprise workstations. Gartner called it “an unacceptable cybersecurity liability” and recommended enterprises block it immediately.
This doesn’t mean “never use it.” It means: understand what you’re granting access to, isolate it from production credentials, and don’t run it on machines that handle sensitive client data. Users who are getting value from it have built explicit guardrails around what the agent can access and do. Those who’ve been burned tended to trust it with everything on day one.
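What a guardrail looks like depends on your setup, but the shape is usually an allowlist checked before anything executes. A hypothetical sketch (the command lists here are illustrative, not an OpenClaw feature):

```python
import shlex

# Hypothetical guardrail: shell access narrowed to a fixed set of
# binaries, with specific risky subcommands blocked explicitly.
ALLOWED = {"dbt", "bq", "git", "ls", "cat"}
BLOCKED_SUBCOMMANDS = {("git", "push"), ("git", "reset")}

def is_permitted(command: str) -> bool:
    """Return True only if the command starts with an allowlisted
    binary and doesn't use an explicitly blocked subcommand."""
    parts = shlex.split(command)
    if not parts or parts[0] not in ALLOWED:
        return False
    if len(parts) > 1 and (parts[0], parts[1]) in BLOCKED_SUBCOMMANDS:
        return False
    return True

print(is_permitted("dbt test --select source:*"))  # True
print(is_permitted("rm -rf ~/.openclaw"))          # False
```

Deny-by-default is the point: the agent gets the handful of commands your monitoring needs, and nothing else.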
Getting started
If you want to try OpenClaw for yourself, getting started takes about an hour.
Hardware. Your existing Mac or laptop works fine. The community’s favorite dedicated setup is a Mac Mini ($599), but a Raspberry Pi ($75) also works. You need Node.js 22 or later.
Cost. The software is free. API costs for the LLM typically run $5-50 per month depending on volume. A completely free stack is possible using Oracle Cloud’s free tier, Gemini Flash-Lite’s free tier, and Ollama for local models.
First setup. Pick one messaging channel (WhatsApp or Telegram are the most common starting points), connect one model (Claude or GPT), and try one automation. Don’t try to build a full monitoring stack on day one.
Recommended first project for data people: a morning briefing cron job. Set up a 7 AM cron that pulls yesterday’s pipeline run status, checks for failed dbt tests, and sends a summary to your Slack or Telegram. This gives you a feel for the cron system, shell execution, and messaging integration without touching anything sensitive. Once that works, you can decide if you want to go deeper.
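For the “checks for failed dbt tests” step, you can read dbt’s own artifact instead of scraping console logs: after `dbt test`, dbt writes `target/run_results.json`, where each entry in `results` carries a `unique_id` and a `status`. A sketch of the briefing builder (the function name and sample data are illustrative):

```python
import json
import tempfile
from pathlib import Path

def briefing_from_run_results(path: Path) -> str:
    """Build a morning-briefing message from dbt's run_results.json
    artifact (written to target/ after `dbt test` or `dbt run`)."""
    results = json.loads(path.read_text())["results"]
    bad = [r for r in results if r["status"] in ("fail", "error")]
    if not bad:
        return f"Morning briefing: all {len(results)} tests passed."
    lines = [f"Morning briefing: {len(bad)} of {len(results)} tests need attention:"]
    lines += [f"- {r['unique_id']}: {r['status']}" for r in bad]
    return "\n".join(lines)

# Demo with a fabricated artifact in place of a real target/run_results.json
sample = {"results": [
    {"unique_id": "test.not_null_orders_order_id", "status": "pass"},
    {"unique_id": "test.unique_orders_order_id", "status": "fail"},
]}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(sample, f)
print(briefing_from_run_results(Path(f.name)))
```

Wire the returned string into the cron job’s announce step and the briefing lands in Slack or Telegram on schedule.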
What’s coming next
This is a technology that 196,000 developers have starred in under a month. It’s also one that CrowdStrike built removal tools for. Both of those things are true at the same time, and that tension is exactly why it’s worth understanding.