cosinusalpha/webctl

webctl

Browser automation for AI agents and humans, built on the command line.

# 1. Install
pip install webctl

# 2. Auto-configure your agent (Creates CLAUDE.md, GEMINI.md system prompts)
webctl init

# 3. Start browsing
webctl start
webctl navigate "https://google.com"
webctl snapshot --interactive-only

webctl init automatically generates the system prompts your agent needs to drive the browser.

Why CLI Instead of MCP?

MCP browser tools have a fundamental problem: the server controls what enters your context. With Playwright MCP, every response includes the full accessibility tree plus console messages. After a few page queries, your context window is full.

CLI flips this around: you control what enters context.

# Filter before context
webctl snapshot --interactive-only --limit 30      # Only buttons, links, inputs
webctl snapshot --within "role=main"               # Skip nav, footer, ads

# Pipe through Unix tools
webctl snapshot | grep -i "submit"                 # Find specific elements
webctl --format jsonl snapshot | jq '.data.role'   # Extract with jq

Beyond filtering, CLI gives you:

| Capability | CLI | MCP |
|---|---|---|
| Filter output | Built-in flags + grep/jq/head | Server decides |
| Debug | Run same command as agent | Opaque |
| Cache & Cost | webctl snapshot > cache.txt | Every call hits server |
| Script | Save to .sh, version control | Ephemeral |
| Human takeover | Same commands | Different interface |
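The "Cache & Cost" row can be sketched as a small shell helper. This is an illustration only: `snapshot_cached` and the cache file name are hypothetical and not part of webctl, and the sketch assumes a saved snapshot stays valid until you delete the file yourself (for example after navigating).

```shell
# Hypothetical helper (not part of webctl): reuse a saved snapshot
# instead of asking the browser again.
# Usage: snapshot_cached FILE
snapshot_cached() {
  cache_file=$1
  # Only take a new snapshot when the cache file is missing or empty.
  if [ ! -s "$cache_file" ]; then
    webctl snapshot --interactive-only > "$cache_file" || return 1
  fi
  cat "$cache_file"
}
```

A call such as `snapshot_cached cache.txt | grep -i "submit"` then reads from disk on every run after the first.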

Agent Integration

Step 1: Install

pip install webctl
webctl setup  # Downloads Chromium

Step 2: Generate Prompts

Run the init command to create the instruction files (CLAUDE.md, etc.) for your agent:

webctl init

Step 3: Add to Config

If your agent does not pick up these files automatically, add this line to its system prompt:

For web browsing, use webctl CLI. Run webctl agent-prompt for instructions.

Note: If a browser MCP is already configured, disable it to avoid conflicts.


Quick Start (Human Usage)

Verify the installation works by driving it yourself:

webctl start                    # Opens visible browser window
webctl navigate "https://example.com"
webctl snapshot --interactive-only
webctl stop --daemon            # Closes browser and daemon

Global installation with `uv`

uv tool install webctl
uv tool run webctl

Linux system dependencies

playwright install-deps chromium
# Or manually install libraries listed in Playwright documentation

Core Concepts

Sessions

Browser stays open across commands. Cookies persist to disk.

webctl start                    # Visible browser
webctl start --mode unattended  # Headless (invisible)
webctl -s work start            # Named profile (separate cookies)
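As a rough sketch of how a named profile might be used in a script, the wrapper below starts a headless session, visits one page, saves cookies, and stops the daemon. `run_in_profile` is a hypothetical name, not part of webctl. The sketch assumes that `-s` can be combined with `--mode unattended` and with every subcommand, and that webctl exits non-zero on failure; neither is confirmed above.

```shell
# Hypothetical wrapper (not part of webctl): run a headless session in a
# named profile and always shut the daemon down afterwards.
# Usage: run_in_profile PROFILE URL
run_in_profile() {
  profile=$1
  url=$2
  webctl -s "$profile" start --mode unattended || return 1
  webctl -s "$profile" navigate "$url"
  status=$?
  # Persist cookies and stop the daemon even if navigation failed.
  webctl -s "$profile" save
  webctl -s "$profile" stop --daemon
  return "$status"
}
```

For example, `run_in_profile work "https://example.com"` would keep its cookies separate from the default profile.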

Element Queries

Semantic targeting based on ARIA roles - stable across CSS refactors:

role=button                     # Any button
role=button name="Submit"       # Exact match
role=button name~="Submit"      # Contains text (preferred)
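Because these queries contain double quotes, quoting them correctly inside shell scripts is easy to get wrong. One way to avoid that, sketched here with a hypothetical helper `q` that is not part of webctl, is to build the partial-match form in one place. The sketch does not escape double quotes inside the text argument.

```shell
# Hypothetical helper (not part of webctl): build a partial-match query.
# Usage: q ROLE TEXT   prints   role=ROLE name~="TEXT"
q() {
  printf 'role=%s name~="%s"\n' "$1" "$2"
}
```

It could then be used as `webctl click "$(q button Submit)"`.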

Output Control

webctl snapshot                                    # Human-readable
webctl --quiet navigate "..."                      # Suppress events
webctl --result-only --format jsonl navigate "..." # Pure JSON

Commands

Navigation & Observation

webctl navigate "https://..."
webctl back
webctl forward
webctl reload
webctl snapshot --interactive-only        # Buttons, links, inputs only
webctl snapshot --within "role=main"      # Scope to container
webctl query "role=button name~=Submit"   # Debug query
webctl screenshot --path shot.png

Interaction

webctl click 'role=button name~="Submit"'
webctl type 'role=textbox name~="Email"' "user@example.com"
webctl type 'role=textbox name~="Search"' "query" --submit
webctl select 'role=combobox name~="Country"' --label "Germany"
webctl check 'role=checkbox name~="Remember"'
webctl press Enter
webctl scroll down

Wait Conditions

webctl wait network-idle
webctl wait 'exists:role=button name~="Continue"'
webctl wait 'url-contains:"/dashboard"'
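Wait conditions are most useful between interaction steps. The sketch below chains them into a search flow that stops at the first failing step. `search` is a hypothetical function and the "Search" field name is a placeholder for your own page; the sketch also assumes webctl exits non-zero when a wait times out or an element is not found.

```shell
# Hypothetical search flow (not part of webctl). The "Search" field name
# is a placeholder for whatever your page exposes.
# Usage: search TERM
search() {
  term=$1
  # Wait until the search box is present before typing into it.
  webctl wait 'exists:role=textbox name~="Search"' || return 1
  webctl type 'role=textbox name~="Search"' "$term" --submit || return 1
  # Let the results page settle before the next snapshot.
  webctl wait network-idle
}
```

After `search "webctl"` returns successfully, a `webctl snapshot --interactive-only` should reflect the results page.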

Session & Console

webctl status                   # Current state & error counts
webctl save                     # Persist cookies now
webctl console --count          # Just counts by level (LLM-friendly)
webctl console --level error    # Filter to errors only

For AI Agents

This section is designed to be read by AI agents directly.

webctl Quick Reference

Control a browser via CLI. Start with webctl start, end with webctl stop --daemon.

Commands:

webctl start                              # Open browser
webctl navigate "URL"                     # Go to URL
webctl snapshot --interactive-only        # See clickable elements
webctl click 'role=button name~="Text"'   # Click element
webctl type 'role=textbox name~="Field"' "text" --submit  # Type + Enter
webctl wait 'exists:role=button name~="..."'              # Wait for element
webctl stop --daemon                      # Close browser

Query syntax:

  • role=button - By ARIA role (button, link, textbox, combobox, checkbox)
  • name~="partial" - Partial match (preferred, more robust)

Tips:

  • Use --interactive-only to reduce output (only buttons, links, inputs)
  • Use name~= for partial matching (handles minor text changes)
  • Use webctl query "..." if element not found - shows suggestions
  • Check webctl status for console error counts before investigating
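The "query if element not found" tip can be wrapped in a helper so the fallback happens automatically. This is a sketch under assumptions: `click_or_explain` is a hypothetical name, and it assumes `webctl click` exits non-zero when no element matches.

```shell
# Hypothetical helper (not part of webctl): click, and on failure print
# the output of `webctl query` to stderr for debugging.
# Usage: click_or_explain QUERY
click_or_explain() {
  query=$1
  if webctl click "$query"; then
    return 0
  fi
  echo "click failed for: $query" >&2
  webctl query "$query" >&2
  return 1
}
```

Example: `click_or_explain 'role=button name~="Continue"'`.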

Architecture

┌─────────────┐     TCP/IPC      ┌─────────────┐
│   CLI       │ ◄──────────────► │   Daemon    │
│  (webctl)   │    JSON-RPC      │  (browser)  │
└─────────────┘                  └─────────────┘
      │                                 │
      ▼                                 ▼
  Agent/User                      Chromium + Playwright

  • CLI: Stateless, sends commands to daemon
  • Daemon: Manages browser, auto-starts on first command
  • Profiles: ~/.local/share/webctl/profiles/

License

MIT
