Skip to content

browser-act/skills

Repository files navigation

BrowserAct Logo

BrowserAct Skills

Discord GitHub Stars License

Website X (Twitter) LinkedIn YouTube


Browser automation CLI built for AI agents. Break through anti-bot walls, hand off to humans across platforms when stuck. Parallel multi-task execution, independent multi-session operation, isolated multi-account browsing.

Why BrowserAct

AI agents need more than a headless Chrome wrapper — they need a complete browser automation platform:

Go where standard tools can't — three-layer anti-blocking, progressively escalating:

  1. Environment layer — stealth fingerprints, TLS rotation, proxy switching. Most scenarios resolved here without triggering any challenge.
  2. Execution layersolve-captcha auto-solves common CAPTCHAs; stealth-extract pulls protected pages in one command, fully unattended.
  3. Human interaction layerremote-assist generates a live URL; user opens it on any device to take over. Once done, agent continues seamlessly.

Concurrent sessions, zero interference:

  • Same-browser multi-session — shared login state, independent execution, tasks don't affect each other (coming soon)
  • Cross-browser multi-session — different browsers operating simultaneously, fully independent
  • Multi-account isolation — each browser has its own fingerprint, proxy, and cookies; websites cannot correlate them

Isolation: independent identity per browser — Each stealth browser is a fully independent identity — independent fingerprint, independent proxy, independent cookies. Websites cannot correlate them. Privacy mode further ensures zero residue between sessions.

Three browser types for different scenarios:

Type Use Case Key Feature
chrome Reuse local Chrome logins Import Profile, run independently
chrome-direct Control your running Chrome via CDP Zero config, full extensions + SSO
stealth Anti-detection browsing Fingerprint spoofing, proxy rotation, batch collection

All three share the same command interface. Learn one, use all.

Designed for agents:

  • Context efficiency — compact text output, consuming fewer tokens than JSON or HTML
  • Index-based operationstate returns indexed interactive elements; agent operates by index directly, no complex DOM parsing
  • Parallel safety — session ownership model + explicit naming, no conflicts between multiple agents
  • Complete capabilities — 50+ commands covering navigation, forms, screenshots, network capture, cookie management

Security: confirmation gating — sensitive operations (creating browsers, deletion, importing Profiles) require explicit user approval. No exceptions. Prior approvals do not carry over. Enforced at the Skill layer, not a configuration toggle.


Install

Tell your AI agent:

Install browser-act Skill from https://github.com/browser-act/skills/tree/main/browser-act

Installation details →


Quick Start

# Extract protected page content (zero config)
browser-act stealth-extract https://example.com

# Full browser automation
browser-act --session my-task browser open <id> https://example.com
browser-act --session my-task state          # See clickable elements
browser-act --session my-task click 3        # Click by index
browser-act --session my-task input 2 "hi"   # Type into field

More examples and workflows →

The agent runs get-skills at the start of each session — gets environment state, browser list, and commands in one call:

browser-act get-skills core --skill-version 2.0.2

How agents discover and use BrowserAct →


Compatibility

OS: Windows, macOS, Linux

Agents: Claude Code · Cursor · VS Code · OpenCode · OpenClaw · Codex · Gemini CLI — works with any agent that can execute shell commands and load Skills.


Documentation

Want to understand how BrowserAct works under the hood? Full documentation covers architecture, commands, sessions, stealth, security, and advanced features.

Read the full documentation →


Also From BrowserAct

Skill Forge — Generate a Skill for Any Website

Need to extract data from the same website repeatedly at scale? Don't write scrapers by hand. Skill Forge explores a site once, discovers its APIs and data patterns, generates a deploy-ready Skill package, then runs reliably without re-exploration — 500 or 5,000 records through the same stable path.

Any website. Any data. One command to start:

Install browser-act-skill-forge Skill from https://github.com/browser-act/skills/tree/main/browser-act-skill-forge

Then tell your agent what you need:

"Forge a Skill that extracts job listings from LinkedIn — title, company, salary, URL."

Skill Forge documentation →

Solutions Catalog

30+ pre-built Skills already generated by Skill Forge, ready to install and run. Covers Amazon, Google Maps, YouTube, Reddit, WeChat, Zhihu, and more.

Browse the full Solutions Catalog →

Build Your Own

Can't find what you need above? Generate a custom Skill for any website in minutes — no coding required. Just describe what data you want or what action to perform, and Skill Forge handles the rest.


💖 Support the Project

BrowserAct Skills is free and open source. If it saves you time, please give us a ⭐ Star — it keeps the project alive and helps us ship more skills.

GitHub Stars

🎁 Bonus: Once you star the repository, you can join our Discord and post in the #claim-500-credits channel to receive 500 free credits!

🤝 Community & Support

Built with ❤️ by the BrowserAct Team

About

Give your AI the power to browse, scrape, and extract structured data from complex websites — with faster execution, lower cost, and more reliable results.

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages