Inspiration

AI coding agents are everywhere now: Claude Code, Cursor, GitHub Copilot, Windsurf. They write code, run terminal commands, and modify files directly on your machine. The problem? They do all of this with your full user permissions. Nothing stops an agent from running rm -rf /, exfiltrating your .env to a remote server, or discarding your uncommitted work with git reset --hard. A single hallucination or prompt injection can cause real, irreversible damage.

We noticed that developers had no lightweight, project-level way to define what an agent can and can't do. OS permissions don't help — the agent is the user. Containers are too heavy for everyday coding. We thought: .gitignore solved "which files to track" with a simple config file. Why can't agent security work the same way? That's how AgentWall was born — a dotfile-driven firewall that travels with your repo and protects your system from rogue AI agents.

What it does

AgentWall adds a .agentfirewall/ config directory to your project that enforces security rules through three layers of defense:

  1. Shell hooks intercept every terminal command before it runs. Dangerous commands like rm -rf /, mkfs, chmod -R 777, and curl | sh are blocked instantly.
  2. A filesystem watcher monitors file changes in real time, catching operations that bypass the shell (like Python's os.remove() or Node's fs.unlink()). On violation, it identifies and kills the responsible agent process.
  3. Network egress control blocks outbound connections to dangerous endpoints, such as cloud metadata services (169.254.169.254) and unauthorized hosts, preventing SSRF attacks and data exfiltration.
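
The egress check in layer 3 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not AgentWall's actual code: the function name is_egress_allowed and the allowed_hosts parameter are hypothetical, and the sketch only handles IP literals and exact host-name matches.

```python
import ipaddress


def is_egress_allowed(host: str, allowed_hosts: frozenset[str]) -> bool:
    """Return True if an outbound connection to `host` should be permitted.

    `allowed_hosts` is a hypothetical allowlist; the real config schema
    is not shown in this writeup.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treat it as a host name and require an exact match.
        return host.lower() in allowed_hosts
    # The link-local range 169.254.0.0/16 contains the cloud metadata
    # endpoint 169.254.169.254, so it is refused even if allowlisted.
    if addr.is_link_local:
        return False
    return host in allowed_hosts
```

Refusing link-local addresses before consulting the allowlist means a misconfigured allowlist cannot re-open the metadata endpoint.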

Every decision (allow, deny, warn) is written to a structured JSON audit log. A web dashboard lets you view your config, browse logs, and stream events live. The firewall supports three modes — enforce (block violations), audit (log only), and off — and ships with built-in presets so setup is a single command: agentfirewall init --preset standard
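
To make the audit log and the three modes concrete, one log entry might be built as in the sketch below. The field names (timestamp, kind, target, verdict, mode, blocked) and the audit_record helper are assumptions for illustration; the actual log schema is not specified here.

```python
import json
from datetime import datetime, timezone


def audit_record(kind: str, target: str, verdict: str, mode: str) -> str:
    """Serialize one firewall decision as a single JSON line (hypothetical schema)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "kind": kind,        # "command", "file", or "network"
        "target": target,    # the command, path, or host that was evaluated
        "verdict": verdict,  # "allow", "deny", or "warn"
        "mode": mode,        # "enforce", "audit", or "off"
        # Only a deny verdict under enforce mode actually blocks the action;
        # audit mode records the same verdict without blocking.
        "blocked": mode == "enforce" and verdict == "deny",
    }
    return json.dumps(record)
```

Writing one JSON object per line keeps the log easy to tail and to stream to a dashboard.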

The Rule Engine pre-compiles all blocklist and allowlist patterns into regexes at startup for fast matching. It evaluates commands, file operations, and network requests, returning a verdict. When the mode is "enforce" and the verdict is DENY, the action is blocked and the Process Killer finds the responsible agent process (Claude, Cursor, Copilot, etc.) using psutil and terminates it — with safety guards to never kill PID 1, itself, or parent processes.
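
The pre-compilation step described above can be sketched like this. The class and method names, the example patterns, and the rule that an allowlist match takes precedence are all assumptions made for illustration; only commands are handled, whereas the real engine also evaluates file operations and network requests.

```python
import re
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"


class RuleEngine:
    def __init__(self, blocklist: list[str], allowlist: list[str]):
        # Compile every pattern once at startup so that each evaluation
        # is only a series of regex searches.
        self._blocked = [re.compile(p) for p in blocklist]
        self._allowed = [re.compile(p) for p in allowlist]

    def evaluate(self, command: str) -> Verdict:
        # Assumption: an explicit allowlist match wins over the blocklist.
        if any(p.search(command) for p in self._allowed):
            return Verdict.ALLOW
        if any(p.search(command) for p in self._blocked):
            return Verdict.DENY
        return Verdict.ALLOW


# Illustrative patterns only; the shipped presets are not reproduced here.
engine = RuleEngine(
    blocklist=[r"\brm\s+-rf\s+/(\s|$)", r"\bmkfs\b", r"\bcurl\b.*\|\s*sh\b"],
    allowlist=[r"^git status$"],
)
```

With these patterns, engine.evaluate("rm -rf /") returns Verdict.DENY, while a scoped command such as "rm -rf /tmp/build" is allowed.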

Configuration lives in .agentfirewall/config.yaml. The web dashboard is built with Flask and supports live log streaming via Server-Sent Events.

Built with: Python · PyYAML · Click · Watchdog · psutil · Flask · Pytest

Challenges we ran into

Our biggest challenge was coverage. Shell hooks only catch commands typed into a terminal, but AI agents can manipulate files through runtime APIs — os.remove() in Python, fs.unlink() in Node — completely bypassing the shell. We had to build a separate filesystem watcher layer and then solve the hard problem of figuring out which process caused a given file event so we could terminate the right one. Without kernel-level APIs like fanotify, we relied on heuristics matching known agent process names and command-line signatures against running processes.
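
The name-and-signature heuristic, together with the safety guards, might look like the sketch below. It works on a plain snapshot of process records (which in practice could be gathered with psutil) so that the matching logic stands alone. The ProcInfo type, the find_agent_pids function, and the signature list are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical signatures; the actual list AgentWall matches is not given here.
AGENT_SIGNATURES = ("claude", "cursor", "copilot")


@dataclass(frozen=True)
class ProcInfo:
    pid: int
    name: str
    cmdline: str


def find_agent_pids(procs, self_pid, ancestor_pids):
    """Return PIDs that look like agents, skipping PID 1, ourselves, and our ancestors."""
    protected = {1, self_pid, *ancestor_pids}
    matches = []
    for p in procs:
        if p.pid in protected:
            continue  # safety guard: never select a protected process
        haystack = f"{p.name} {p.cmdline}".lower()
        if any(sig in haystack for sig in AGENT_SIGNATURES):
            matches.append(p.pid)
    return matches
```

Because this is substring matching, it can flag an unrelated process whose command line happens to contain one of the signatures, which is exactly the imprecision that kernel-level attribution would remove.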

Shell hook installation also had to be idempotent — users source .bashrc multiple times, so we used guard markers (# >>> agentfirewall >>>) to prevent duplicate hooks and ensure clean removal.
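
A guard-marker install and removal could be sketched as follows. The opening marker is the one quoted above; the closing marker and the function names are assumptions, and the functions operate on the rc file's text rather than on the file itself.

```python
BEGIN = "# >>> agentfirewall >>>"
# Assumed closing marker; only the opening one appears in this writeup.
END = "# <<< agentfirewall <<<"


def remove_hook(rc_text: str) -> str:
    """Drop every line from BEGIN through END, inclusive."""
    kept, inside = [], False
    for line in rc_text.splitlines():
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


def install_hook(rc_text: str, hook_body: str) -> str:
    """Insert the guarded hook block; calling this twice leaves exactly one copy."""
    base = remove_hook(rc_text)  # strip any earlier copy first
    return f"{base}{BEGIN}\n{hook_body}\n{END}\n"
```

Removing any existing block before appending is what makes the install idempotent, and the same markers give uninstall an unambiguous range to delete.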

Accomplishments that we're proud of

  • One command (agentfirewall init --preset standard) gives you a working firewall with 13+ blocked dangerous commands, protected sensitive paths, and SSRF prevention — all in under 30 seconds
  • Pairing shell hooks with the filesystem watcher means agents can't bypass protection by using programmatic file APIs instead of shell commands
  • Violations don't just get logged — the offending agent process is automatically identified and terminated
  • The dotfile model means .agentfirewall/ commits to your repo and the whole team gets the same protection automatically

What we learned

No single interception point is enough. Shell hooks miss programmatic file operations, and filesystem watchers miss network calls. Real security requires defense in depth: multiple enforcement layers working together to close gaps. We also learned that the "agent permission" problem, where AI agents run with full user privileges and zero privilege separation, is a fundamental gap in developer tooling that the industry hasn't seriously addressed yet. The tools to build guardrails already exist; we had not seen them assembled into a developer-friendly package.

What's next for AgentWall

  • Packet-level network enforcement with upload size limits to catch bulk data exfiltration
  • A plugin system so the community can contribute and share rule sets for specific frameworks and environments
  • Daemon mode for running the watcher as a managed background service
  • Native integrations with Claude Code, Cursor, and Copilot for tighter, agent-aware policy enforcement
  • Linux fanotify support for exact process attribution — no more heuristics
  • CI/CD integration to enforce agent security policies on AI-generated code before it gets merged
