← Back to home

Trust

Security posture, roadmap, and how we think about agents that can take real-world actions.

A New Era in Computing Security

For the past 20 years, security models have been built around locking devices and applications down — setting boundaries between inter-process communications, separating internet from local, sandboxing untrusted code. These principles remain important.

But AI agents represent a fundamental shift.

Unlike traditional software that does exactly what code tells it to do, AI agents interpret natural language and make decisions about actions. They blur the boundary between user intent and machine execution. They can be manipulated through language itself.

We understand that with the great utility of a tool like OpenClaw comes great responsibility. Done wrong, an AI agent is a liability. Done right, we can change personal computing for the better.

This security program exists to get it right.

Context

OpenClaw is an AI agent platform. Unlike chatbots that only generate text, OpenClaw agents can:

  • Execute shell commands on the host machine
  • Send messages through WhatsApp, Telegram, Discord, Slack, and other channels
  • Read and write files in the workspace
  • Fetch arbitrary URLs from the internet
  • Schedule automated tasks
  • Access connected services and APIs

This capability is what makes OpenClaw useful. It's also what makes security critical.

AI agents that can take real-world actions introduce risks that traditional software doesn't have:

  1. Prompt injection — Malicious users can craft messages that manipulate the AI into performing unintended actions
  2. Indirect injection — Malicious content in fetched URLs, emails, or documents can hijack agent behavior
  3. Tool abuse — Even without injection, misconfigured agents can cause damage through overly permissive settings
  4. Identity risks — Agents can send messages as you, potentially damaging relationships or reputation

These aren't theoretical. They're documented attack patterns that affect all AI agent systems.

Scope

This security program covers the entire OpenClaw ecosystem. Nothing is out of scope.

Core Platform

  • OpenClaw CLI and Gateway (openclaw)
  • Agent execution engine
  • Tool implementations
  • Channel integrations (WhatsApp, Telegram, Discord, Slack, Signal, etc.)

Applications

  • macOS desktop application
  • iOS mobile application
  • Android mobile application
  • Web interface

Services

  • ClawHub (clawhub.ai) — Skills marketplace and registry
  • Documentation (docs.openclaw.ai)
  • Any hosted infrastructure

Extensions

  • Official extensions (extensions/)
  • Plugin SDK and third-party plugins
  • Skills distributed through ClawHub

People

  • Core maintainers and contributors
  • Security processes and response procedures
  • Supply chain and dependency management

Program Overview

We're establishing a formal security function with four phases:

1

Transparency

Develop threat model openly with community contribution

2

Product Security Roadmap

Define defensive engineering goals and track publicly

3

Code Review

Manual security review of entire codebase

4

Security Triage

Formal process for handling vulnerability reports

Phase 1: Transparency

Goal

Develop and publish our threat model openly, inviting community contribution, so users understand the risks and can make informed decisions about their deployments.

Why

Security through obscurity doesn't work. Attackers already know these techniques — they're documented in academic papers, security blogs, and conference talks. What's missing is clear communication to users about:

  • What risks exist
  • What we're doing about them
  • What users should do to protect themselves

By developing the threat model openly, we benefit from collective expertise and build trust through transparency.

Threat Model Coverage

Category Risks Covered
A. Input Manipulation Direct prompt injection, indirect injection, tool argument injection, context manipulation
B. Auth & Access AllowFrom bypass, privilege escalation, cross-session access, API key exposure
C. Data Security System prompt disclosure, workspace exposure, memory leakage, data exfiltration
D. Infrastructure SSRF, gateway exposure, dependency vulnerabilities, file permissions
E. Operations Logging sensitive data, insufficient monitoring, resource exhaustion, misconfiguration
F. Supply Chain ClawHub skills integrity, extension security, dependency vulnerabilities

Threat Model Scope

Component Why It's Included
Core platform (CLI, Gateway, agents, tools) Primary attack surface
ClawHub (clawhub.ai) Skills marketplace — supply chain risk
Mobile apps (iOS, Android) Agent control interface, credential storage
Desktop app (macOS) Gateway host, system integration
Extensions and plugins Third-party code execution
Build and release pipeline Distribution integrity

Each risk in the threat model will include description and severity rating, attack examples, current mitigations, known gaps, and user recommendations.

The threat model will be open for community contribution via pull requests.

Phase 2: Product Security Roadmap

Goal

Create a public product security roadmap defining defensive engineering goals, tracked as GitHub issues so the community can follow progress, provide input, and contribute.

Defensive Engineering Goals

Category Goal Description
Prompt Injection Protection Input validation Pattern detection and alerting for injection attempts
Tool confirmation Require explicit approval for sensitive actions
Context isolation Prevent cross-session contamination
Privacy Enhancements System prompt protection Prevent disclosure of system prompts
Data minimization Reduce unnecessary data retention
Audit logging Clear visibility into agent actions
Access Control Fine-grained permissions Per-tool, per-session access controls
Rate limiting Prevent resource exhaustion
Spending controls Hard limits on API costs
Supply Chain Skills verification Integrity checks for ClawHub skills
Dependency auditing Automated vulnerability scanning
Signed releases Cryptographic verification of updates

Specific priority issues will be identified through the Phase 3 code review and added to the public roadmap as they are discovered and triaged.

Phase 3: Code Review

Goal

Conduct a comprehensive manual security review of the entire codebase, supplemented by automated tooling where appropriate, to identify vulnerabilities we've missed and validate our threat model.

Scope

The code review covers the entire OpenClaw codebase and ecosystem:

Area Path Why
Agent executionsrc/agents/Core attack surface — how agents run
Tool implementationssrc/agents/tools/What agents can do — exec, messaging, web
Message processingsrc/auto-reply/Entry point for all user input
Security utilitiessrc/security/Existing security controls
Gateway serversrc/gateway/Network-exposed component
Authenticationsrc/*/auth*Credential handling, API keys
Session managementsrc/config/sessions.tsCross-session isolation
Pairing and access controlsrc/pairing/, src/*/access-control*DM and group gating
External content handlingsrc/security/external-content.tsInjection defenses
macOS desktop appapps/macos/Gateway host, system integration
iOS mobile appapps/ios/Agent control, credential storage
Android mobile appapps/android/Agent control, credential storage
ClawHubclawhub.aiSkills registry — supply chain risk
Official extensionsextensions/First-party plugins
Build and release pipelineCI/CD, scriptsDistribution integrity, signing

Approach

  1. Manual code review — Line-by-line analysis of security-critical paths
  2. Automated scanning — Static analysis, dependency auditing, secret detection
  3. Dynamic testing — Attempting documented attack patterns against running system
  4. Architecture review — Evaluating trust boundaries and data flows

Disclosure

  • All critical and high findings fixed before public disclosure
  • Findings summary published after remediation
  • Full report available on request
  • CVEs assigned where applicable

Phase 4: Security Triage Function

Goal

Establish a formal process for receiving, triaging, and responding to security vulnerability reports.

Report a Vulnerability

We take security reports seriously. Complete reports receive a response within 48 hours.

Required in Reports

Title Severity Assessment Impact Affected Component Technical Reproduction Demonstrated Impact Environment Remediation Advice

Reports without reproduction steps, demonstrated impact, and remediation advice will be deprioritized. Given the volume of AI-generated scanner findings, we must ensure we're receiving vetted reports from researchers who understand the issues.

Response SLAs

Severity Definition First Response Triage Fix Target
Critical RCE, auth bypass, mass data exposure 24 hours 48 hours 7 days
High Significant impact, single-user scope 48 hours 5 days 30 days
Medium Limited impact, requires specific conditions 5 days 14 days 90 days
Low Minor issues, defense in depth 14 days 30 days Best effort

Our Commitments

  • Acknowledge all complete reports within 48 hours
  • Provide status updates at least every 14 days
  • Credit researchers in advisories (unless anonymity requested)
  • Not pursue legal action against good-faith security research
  • Consider bounties for qualifying critical/high findings (case-by-case)

Security Advisor

OpenClaw is bringing on Jamieson O'Reilly (@theonejvo) as lead security advisor to guide this program.

Jamieson is the founder of Dvuln, co-founder of Aether AI (the world's most dangerous AI, in your corner), a member of the CREST Advisory Council, and brings extensive experience in offensive security, penetration testing, and security program development.

Responsibilities

  • Lead threat modeling and risk assessment
  • Scope and oversee code review
  • Establish triage process and response procedures
  • Review security-critical code changes
  • Provide guidance on security architecture decisions

Current Security Posture

OpenClaw already has security controls in place. Understanding what exists helps users configure their deployments appropriately.

Secure by Default

DM Policy: Pairing

Unknown senders must complete pairing flow with expiring code

Exec Security: Deny

Commands not on allowlist are denied by default, user prompted for approval

Default AllowFrom: Self-only

If not configured, only your own number can DM the agent

Session Isolation

Conversations are isolated per session key

SSRF Protection

Internal IPs and localhost blocked in web_fetch

Gateway Auth Required

WebSocket connections must authenticate

Verify Your Setup

openclaw security audit --deep

Key items to verify:

  • DM policy is pairing or allowlist (not open)
  • allowFrom is configured for your channels
  • Exec security is not set to full unless intended
  • Gateway is bound to loopback or behind authentication
  • Workspace doesn't contain secrets

Timeline

WEEK 1-2: Phase 1 — Transparency
├── Threat model development begins (open for contribution)
├── Security configuration guide drafted
├── Visual overview created
└── Announcement posted

WEEK 3-4: Phase 2 — Product Security Roadmap
├── GitHub issues created for defensive engineering goals
├── Security label and milestone set up
├── Community input period opens
└── First security work begins

WEEK 5-8: Phase 3 — Code Review Preparation
├── Scope finalized (entire codebase)
├── Review begins
└── Initial findings

WEEK 8-12: Phase 3 — Code Review Execution
├── Manual review completed
├── Findings documented
├── Remediation for critical/high
└── Verification completed

WEEK 8+: Phase 4 — Triage Function
├── security@openclaw.ai live
├── PGP key published
├── Disclosure policy published
└── First advisories (if needed)

ONGOING:
├── Monthly security updates
├── Continuous threat model refinement
├── Regular dependency auditing
└── Community engagement

FAQ

"Is OpenClaw safe to use right now?"

Yes, with proper configuration. OpenClaw has security controls enabled by default:

  • DM Policy: Defaults to pairing — unknown senders must complete a pairing flow with an expiring code
  • Exec Security: Defaults to deny with ask: on-miss — dangerous commands require approval
  • AllowFrom: If not configured, defaults to self-only
  • Gateway Auth: Required by default

Run openclaw security audit --deep to verify your setup. See docs.openclaw.ai/gateway/security

"Why develop the threat model openly?"

These attack techniques are already public knowledge — documented in papers, blogs, and talks. Developing openly benefits from collective expertise, builds trust through transparency, and holds us accountable.

"Why require remediation advice in vulnerability reports?"

We receive reports from automated scanners and AI tools that flag theoretical issues without understanding them. By requiring reporters to propose a fix, we filter out scanner noise, get actionable reports from researchers who understand the issues, and speed up remediation with expert input.

"What about ClawHub?"

ClawHub (clawhub.ai) is in scope for the entire security program — threat model, code review, and ongoing monitoring. Skills are code that runs in your agent's context — supply chain security is critical.

"What about mobile apps and desktop?"

All applications are in scope. The iOS app, Android app, and macOS desktop app will all be covered by the code review and included in the threat model. Nothing is out of scope.

"Can I help?"
  • Contribute to the threat model via pull request
  • Review and comment on security-labeled issues
  • Report vulnerabilities through security@openclaw.ai
  • Help improve security documentation