OpenClaw is a powerful open-source AI agent that runs locally and connects to your chats, files, and terminal, but its security lags far behind its massive success.
OpenClaw is a self-hosted AI agent that runs on your computer or server and works with the messaging apps you already use: WhatsApp, Telegram, Discord, Slack, Microsoft Teams, and even iMessage. It's essentially a local assistant that doesn't just answer questions but also performs actions: reading and modifying files, running shell commands, browsing the web, managing your calendar, and installing tools.
The project began as a weekend hack by Austrian developer Peter Steinberger, initially released as “Clawdbot” in November 2025, then renamed “Moltbot” and finally “OpenClaw” in early 2026. Unlike with cloud assistants, the data stays where you decide (laptop, homelab, VPS), you retain control over the model backend, and you integrate the agent with the tools you use every day.
OpenClaw has amassed over 180,000 stars on GitHub in just a few weeks, attracted nearly 2 million visitors in a single week, and spawned an ecosystem of thousands of third-party skills. A vibrant, anything-goes community has formed around the project, typical of viral open source projects, with integrations ranging from IoT devices to AI girlfriend projects.
One viral incident involved an engineer granting the agent iMessage access and watching it go berserk, sending over 500 messages to him, his wife, and random contacts. Incidents like this make for amusing stories, but they highlight a crucial point: OpenClaw often gains deep access to users' digital lives, while its security barriers aren't up to par.
At the end of January 2026, a high-severity vulnerability (rated High on the CVSS scale) was disclosed in the OpenClaw control panel. The issue stemmed from the gatewayUrl parameter in the web interface: the software automatically opened a WebSocket connection to whatever URL the parameter specified and sent the user's authentication token without asking for confirmation.
An attacker could craft a malicious link, get the victim to open it (or embed it in a web page), capture the token on a server under their control, connect to OpenClaw's local gateway, disable the sandbox and tool policies, and execute arbitrary commands. One click, total compromise. The vulnerability was fixed in the next build, but it was only one of three high-severity advisories published within a few days, alongside additional command injection issues. Two more reports followed on February 4th. Five advisories in less than a week point to a codebase where security came after features.
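The missing check can be illustrated with a minimal Python sketch. It is not OpenClaw's actual fix: the function names, the allowlist, and the assumption that the gateway speaks plain WebSocket on loopback port 18789 are all hypothetical. The idea is simply that a URL supplied through a link is compared against trusted endpoints before any token is sent.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the only trusted gateways are loopback endpoints on port 18789.
TRUSTED_GATEWAYS = frozenset({
    ("ws", "127.0.0.1", 18789),
    ("ws", "localhost", 18789),
})


def is_trusted_gateway(url: str) -> bool:
    """Return True only if scheme, host, and port are all on the allowlist."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    # Reject userinfo tricks such as ws://127.0.0.1:18789@evil.example
    if parts.username or parts.password:
        return False
    return (parts.scheme, parts.hostname, port) in TRUSTED_GATEWAYS


def resolve_gateway(requested: Optional[str],
                    default: str = "ws://127.0.0.1:18789") -> str:
    """Pick the gateway to connect to; refuse anything not allowlisted."""
    if requested is None:
        return default
    if not is_trusted_gateway(requested):
        raise ValueError("untrusted gateway URL; refusing to send the token")
    return requested
```

With a check like this, a link carrying an attacker-controlled gateway address fails before the connection is opened, instead of silently delivering the token.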
OpenClaw's extensibility comes from "skills," plugins distributed through ClawHub, the community marketplace. A Koi Security audit of approximately 2,800 skills identified 341 malicious packages, grouped into a campaign dubbed "ClawHavoc."
These skills posed as crypto trading bots, productivity tools, or utilities, but in reality distributed infostealers like Atomic Stealer for macOS, credential stealers for Windows, and ClickFix-style social engineering scripts. A fake Polymarket skill, for example, opened a reverse shell to the attacker's server, giving the attacker complete remote control of the victim's machine.
In subsequent analyses, several security firms reported nearly 900 malicious or dangerously flawed skills on ClawHub, including ClawHavoc campaigns and packages that exposed API keys or secrets in their configurations. The project responded by integrating scanning with VirusTotal and a reporting mechanism for suspicious skills, but the underlying issue remains: ClawHub is effectively an unverified supply chain, and users install third-party code with the same privileges as the agent.
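One common mitigation for an unverified supply chain is to pin reviewed artifacts by hash. The following is a minimal Python sketch of that general technique, not a ClawHub or OpenClaw feature. The function names and the idea of a locally maintained pin list are assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Mapping


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a skill archive."""
    return hashlib.sha256(data).hexdigest()


def is_reviewed_skill(name: str, archive: bytes,
                      pinned: Mapping[str, str]) -> bool:
    """Accept a skill only if its digest matches the one recorded at review time.

    `pinned` is a hypothetical local mapping of skill name -> SHA-256 hex digest,
    filled in by whoever audited that exact version of the skill.
    """
    expected = pinned.get(name)
    if expected is None:
        return False  # never reviewed: do not install
    return hmac.compare_digest(sha256_hex(archive), expected)


# Usage: pin the digest after review, then check every later download.
reviewed = b"contents of the audited skill archive"
pins = {"calendar-helper": sha256_hex(reviewed)}
print(is_reviewed_skill("calendar-helper", reviewed, pins))
print(is_reviewed_skill("calendar-helper", reviewed + b" tampered", pins))
```

Pinning does not tell you whether a skill is safe. It only guarantees that what gets installed is byte-for-byte what someone reviewed, so a silently updated or swapped package is rejected.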
Researchers have identified tens of thousands of exposed OpenClaw instances with insecure default configurations. Some scans found over 135,000 internet-accessible instances, thousands of which could be directly exploited via the RCE flaw, which had already been fixed upstream but not yet patched by the operators. These systems exposed API keys, chat history, and account credentials to anyone who knew where to look.
The main cause is a "split personality" in the default settings: the desktop CLI installation binds correctly to 127.0.0.1, while the official Docker setup binds to 0.0.0.0:18789, making the gateway reachable on all network interfaces, including Internet-facing ones. Combined with guides that recommend the "easy" LAN mode and a user base that prioritizes simplicity over hardening, many servers remain exposed, often running outdated versions that lack the latest authentication protections.
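The difference between the two defaults can be checked mechanically. Below is a minimal Python sketch, assuming bind addresses are written as host:port strings like the ones quoted above. The helper name is hypothetical and the check is not part of OpenClaw.

```python
import ipaddress


def is_loopback_only(bind: str) -> bool:
    """Return True if a 'host:port' bind address listens on loopback only."""
    host, sep, port = bind.rpartition(":")
    if not sep or not port.isdigit():
        return False  # not in host:port form; treat as unsafe
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:18789
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # unknown hostname; treat as unsafe


# Flag every binding that is reachable from outside the machine.
bindings = ["127.0.0.1:18789", "0.0.0.0:18789", "[::1]:18789"]
exposed = [b for b in bindings if not is_loopback_only(b)]
print(exposed)  # ['0.0.0.0:18789']
```

A loopback binding is only a first layer. A gateway that has to be reachable from other machines still needs authentication and a firewall or VPN in front of it.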
It depends on how competent and cautious you are.
For a technically savvy user who understands the permissions they're granting, keeps the software constantly updated, carefully tests the skills before installing them, and runs OpenClaw on a non-critical machine or in an isolated environment (VM, tightly closed container), the risk can be kept under control. In return, you get a truly useful AI agent for automating repetitive tasks on messaging apps and the local system.
But to keep this “manageable risk” you must:
For the "geek" who treats it as a lab project in a well-isolated environment, it could be an interesting experiment. For the average home user who simply wants a "smart" assistant and isn't too concerned about security, it's best to wait for the project to mature.
Bitdefender GravityZone telemetry data shows that many employees are installing OpenClaw directly on company PCs with one-liner commands. This is pure Shadow AI: unmanaged agents with broad system privileges, deployed outside of any IT governance process.
The appeal is obvious: connect the agent to your company's Slack, email, and file system, and you get an assistant that drafts replies, summarizes threads, and automates repetitive tasks. But at the same time, you create a new attack surface that the security team isn't aware of and that monitoring tools can't track.
OpenClaw connects directly to email, files, messaging platforms, and system tools, creating non-human identities and access paths that bypass traditional IAM controls and secrets management. RBAC policies, conditional access rules, or MFA don't apply to an agent that's been issued tokens and API keys and allowed to do its thing.
An agent that processes external data (emails, documents, messages) is intrinsically vulnerable to prompt injection. An attacker who manages to insert a specially crafted message into an employee's inbox can trick the agent into performing actions on their behalf, from stealing data to executing malicious commands. This is a well-documented class of attacks against agent-based systems, and OpenClaw's architecture makes it particularly vulnerable.
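Prompt injection cannot be filtered out reliably, so a common defensive pattern is to limit what an agent may do after it has read untrusted content. The Python sketch below illustrates that pattern only. The tool names, the ToolCall type, and the decide function are hypothetical and do not describe OpenClaw's internals.

```python
from dataclasses import dataclass

# Hypothetical names for tools that can cause damage or leak data.
HIGH_RISK_TOOLS = frozenset({"shell", "send_message", "write_file"})


@dataclass(frozen=True)
class ToolCall:
    tool: str
    # True if external content (email, document, chat message) was in the
    # agent's context when it proposed this call.
    derived_from_untrusted: bool


def decide(call: ToolCall) -> str:
    """Require human confirmation for risky actions influenced by external input."""
    if call.tool in HIGH_RISK_TOOLS and call.derived_from_untrusted:
        return "ask_user"
    return "allow"


# An email tells the agent to run a command: the call is held for approval.
print(decide(ToolCall(tool="shell", derived_from_untrusted=True)))
# Summarizing the same email needs no dangerous tool and goes through.
print(decide(ToolCall(tool="summarize", derived_from_untrusted=True)))
```

A gate like this does not stop the injection itself. It only keeps an injected instruction from turning into a shell command or an outgoing message without a human seeing it first.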
With hundreds of malicious or seriously flawed skills already detected on ClawHub, installing a skill means running unreviewed third-party code with the same permissions as the agent. In an enterprise context, this amounts to allowing employees to install arbitrary software from an unmoderated marketplace, something most companies have spent two decades trying to prevent.
OpenClaw doesn't offer adequate audit logging, granular access controls, integrations with enterprise identity providers, or the governance and compliance tools required in production. There's no central console to manage instances, define roles, enforce policies, or have unified visibility into agent actions.
Even if you've never authorized OpenClaw, chances are someone somewhere on your network has already installed it.
The idea of an open-source, self-hosted AI agent deeply integrated with your tools is very appealing, and the level of automation OpenClaw offers is remarkable for a project that began as a weekend hack. However, critical vulnerabilities, a compromised supply chain, mass-exposed instances, and the lack of enterprise controls make it clear that the technical ambition still doesn't match the security maturity required within the enterprise.
AI agents will become increasingly central, but without adequate controls, their power translates directly into risk. For organizations with data to protect, the conclusion is simple: OpenClaw is not enterprise-ready today; it makes sense to reevaluate it in a few months, when (and if) the security model has reached a functional level.
For home users who are aware and able to manage the trade-offs, the advice is to experiment cautiously: keep it patched, keep it local, and keep it away from anything you can't afford to lose. To better understand the structural problem of prompt injection, you can practice in dedicated labs that simulate real-world attack scenarios on AI agents.