Most AI-written tests optimize for coverage. They assert implementation details, mock away the real risk, and pass even when the product breaks.
WIO is one testing workflow skill with five commands: $wio scan, $wio test, $wio workload, $wio review, and $wio doctor.
Direct skill install:

    npx skills add workersio/skills

Claude Code marketplace install:

    claude plugin marketplace add workersio/skills
    claude plugin install wio@workersio-skills

Claude plugin installs include the WIO skill, plugin hooks, and Markdown subagents from plugins/wio/agents/.

Codex marketplace add:

    codex plugin marketplace add workersio/skills

For local Codex plugin testing, add this checkout as the marketplace root:

    codex plugin marketplace add .

Codex plugin installs include the WIO skill and plugin hook config. Codex custom agents are a separate native surface: they are loaded from .codex/agents/ in a project or ~/.codex/agents/ for the user.

To enable WIO Codex agents globally:

    mkdir -p ~/.codex/agents
    cp .codex/agents/wio-*.toml ~/.codex/agents/

To enable WIO Codex agents for the current project:

    mkdir -p .codex/agents
    cp /path/to/wio-skills/.codex/agents/wio-*.toml .codex/agents/

To enable the WIO Codex hook config for the current project:

    cp /path/to/wio-skills/.codex/hooks.json .codex/hooks.json

Verify Codex agent files:

    find .codex/agents ~/.codex/agents -name 'wio-*.toml' 2>/dev/null

| Command | What it does |
|---|---|
| $wio scan [target] | Maps product behavior, existing tests, CI, and risk areas to find the highest-value tests to add next. |
| $wio test [target] | Runs the full loop: discover a bug-prone candidate, pick strategy, write test, validate, review, then keep only if valuable. |
| $wio workload [target] | Generates or implements realistic, replayable user-session/API/CLI/job/load workloads with controlled variance and assertions. |
| $wio review [target] | Reviews a test for customer value, developer-flow value, signal quality, maintainability, and false confidence. |
| $wio doctor [target] | Audits test-suite health: weak assertions, flakes, excessive mocks, broad snapshots, slow feedback, skipped tests, and missing critical behavior coverage. |

WIO includes three focused subagents:
| Subagent | Role |
|---|---|
| wio-candidate-scout | Read-only discovery of high-value test candidates before implementation. |
| wio-strategy-critic | Read-only challenge of the selected strategy before editing tests. |
| wio-test-reviewer | Read-only post-write review that returns KEEP, REDO, or REMOVE. |

The main agent still writes the test. Subagents gather evidence, challenge the strategy, and review value. They do not duplicate reference content and they do not own the workflow.
Claude plugin subagents live in plugins/wio/agents/, which is the official Claude plugin location. Project-local Claude subagents can also be copied to .claude/agents/ when a repo is not using the plugin.
Codex project subagents live in .codex/agents/*.toml, which is the official Codex custom-agent location.
WIO hooks only remind the active agent to validate test changes and apply the WIO value gate. The executable hook logic lives in plugins/wio/skills/wio/scripts/test-review-reminder.py.
Hook config exists in the official locations:
- Claude plugin hook: plugins/wio/hooks/hooks.json
- Claude project hook: .claude/settings.json
- Codex plugin hook: plugins/wio/hooks/hooks.json
- Codex project hook: .codex/hooks.json
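
As a rough illustration of what a reminder-only hook can look like, the sketch below reads a JSON event and prints a reminder when the edited path looks like a test file. It is not the contents of test-review-reminder.py. The payload shape (`tool_input.file_path`), the path patterns in `TEST_PATTERNS`, and the reminder wording are all assumptions made for this example.

```python
import json
import re
import sys

# Hypothetical heuristics for "this path is a test file"; not taken from WIO.
TEST_PATTERNS = [
    re.compile(r"(^|/)(tests?|__tests__|spec)/"),      # tests/, test/, __tests__/, spec/
    re.compile(r"(_test|\.test|\.spec)\.[A-Za-z]+$"),   # foo_test.py, foo.test.ts, foo.spec.js
    re.compile(r"(^|/)test_[^/]+\.py$"),                # test_foo.py
]

REMINDER = "Test file changed: run it, then apply the WIO value gate before keeping it."


def looks_like_test(path):
    """Return True if the path matches any of the assumed test-file patterns."""
    normalized = path.replace("\\", "/")
    return any(p.search(normalized) for p in TEST_PATTERNS)


def reminder_for(event):
    """Return the reminder text for a hook event, or None if no reminder applies."""
    if not isinstance(event, dict):
        return None
    # Assumed payload shape: {"tool_input": {"file_path": "..."}}.
    tool_input = event.get("tool_input") or {}
    if not isinstance(tool_input, dict):
        return None
    path = tool_input.get("file_path", "")
    if not isinstance(path, str) or not path:
        return None
    return REMINDER if looks_like_test(path) else None


def main(stdin=sys.stdin, stdout=sys.stdout):
    try:
        event = json.load(stdin)
    except json.JSONDecodeError:
        return 0  # malformed input: stay silent rather than block the agent
    message = reminder_for(event)
    if message:
        print(message, file=stdout)
    return 0  # always succeed; the hook only reminds, it never blocks
```

A real hook script would end by calling `sys.exit(main())`. That call is left out here so the functions can be imported and exercised without reading stdin.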
Detailed testing guidance lives only in plugins/wio/skills/wio/references/.
Reference topics include:
| Area | Covers |
|---|---|
| Behavior mapping | Turning product behavior, workflows, APIs, and incidents into test candidates. |
| Risk-based testing | Prioritizing tests by customer impact, likelihood, confidence gap, and cost. |
| Test level selection | Choosing unit, component, integration, contract, E2E, monitoring, or specialized checks. |
| Workload modeling | Building realistic session, traffic, stateful, synthetic, or operator workloads with bounded variance and replay. |
| Oracles and assertions | Designing assertions that fail for real regressions and explain what broke. |
| Test data and fixtures | Setup, isolation, factories, seeds, cleanup, and state management. |
| Mocking and doubles | Preserving fidelity while keeping tests deterministic and fast. |
| Suite health | Finding flakes, weak signal, slow feedback, skipped tests, and CI blind spots. |
| Advanced strategies | Static analysis, security testing, fuzzing, property-based testing, mutation testing, performance testing, resilience testing, and regression selection. |
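
One way to make the risk-based testing row concrete is a simple score over the four factors it names. The sketch below is only an illustration: the 1 to 5 scales, the `Candidate` type, the example candidates, and the formula (impact times likelihood times confidence gap, divided by cost) are assumptions, not WIO's actual ranking method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    impact: int          # customer impact if the behavior breaks, 1 (low) to 5 (high)
    likelihood: int      # how likely a regression is, 1 to 5
    confidence_gap: int  # how little the existing tests tell us, 1 to 5
    cost: int            # effort to write and maintain the test, 1 to 5

    def score(self) -> float:
        # Assumed formula: higher risk raises the score, higher cost lowers it.
        return self.impact * self.likelihood * self.confidence_gap / self.cost


def rank(candidates):
    """Return candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: c.score(), reverse=True)


# Hypothetical candidates with made-up ratings.
candidates = [
    Candidate("billing eligibility regression", impact=5, likelihood=4, confidence_gap=4, cost=2),
    Candidate("settings page tooltip", impact=1, likelihood=2, confidence_gap=3, cost=1),
    Candidate("checkout retry after timeout", impact=5, likelihood=3, confidence_gap=5, cost=4),
]

for c in rank(candidates):
    print(f"{c.score():6.2f}  {c.name}")
```

The exact weights matter less than the habit: a cheap test on a low-impact behavior should not outrank an expensive test on a path where failure costs customers.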

    $wio scan checkout
    $wio test billing eligibility regression
    $wio workload onboarding session
    $wio review tests/billing_eligibility_test.py
    $wio doctor API test suite

Use scan when you do not yet know what to test. Use test when you want the whole candidate-strategy-write-review loop. Use workload when interaction across a realistic session is the risk. Use review when a test already exists or has just been written. Use doctor when an existing suite is hard to trust.
A generated or recommended test should answer:
- What user, operator, customer, or API consumer failure does this prevent?
- What production, release, support, debugging, or review risk does it reduce?
- Would it fail for the regression that matters?
- Is the assertion specific enough to diagnose the broken behavior?
- Does the setup preserve the important dependency, state, permission, timing, or data risk?
- Does this belong in local development, PR CI, nightly, release, or production monitoring?
- If this is a workload, is variance bounded, seeded, replayable, and checked by meaningful invariants?
If those answers are weak, the test should be redesigned or removed.
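
The workload question in the list above can be illustrated with a small seeded generator. This is a sketch under assumptions: the `Cart` class, the action names, and the invariants are hypothetical stand-ins for a real system under test. What it shows is the shape: a fixed seed makes the session replayable, the step count and quantity range bound the variance, and invariants are checked after every step with a message that names the seed and the failing action.

```python
import random


class Cart:
    """Hypothetical system under test: maps item name to quantity."""

    def __init__(self):
        self.items = {}

    def add(self, item, qty):
        self.items[item] = self.items.get(item, 0) + qty

    def remove(self, item):
        self.items.pop(item, None)

    def total_quantity(self):
        return sum(self.items.values())


def generate_session(seed, steps=20):
    """Return a list of actions; the same seed always yields the same session."""
    rng = random.Random(seed)          # local RNG, so global random state is untouched
    catalog = ["book", "pen", "lamp"]  # bounded choice set
    actions = []
    for _ in range(steps):
        item = rng.choice(catalog)
        if rng.random() < 0.7:
            actions.append(("add", item, rng.randint(1, 3)))  # bounded quantity
        else:
            actions.append(("remove", item, 0))
    return actions


def run_session(seed, steps=20):
    """Replay one session against the cart and check invariants after each step."""
    cart = Cart()
    expected = {}  # simple reference model used as the oracle
    for i, (kind, item, qty) in enumerate(generate_session(seed, steps)):
        if kind == "add":
            cart.add(item, qty)
            expected[item] = expected.get(item, 0) + qty
        else:
            cart.remove(item)
            expected.pop(item, None)
        # Invariants: the cart matches the model and never holds a non-positive quantity.
        assert cart.items == expected, f"seed={seed} step={i} action={(kind, item, qty)}"
        assert all(q > 0 for q in cart.items.values()), f"seed={seed} step={i}"
    return cart


for seed in range(5):
    run_session(seed)
```

In a real workload the `Cart` would be the actual API, UI session, or job under test, and the reference model would stay deliberately simpler than the implementation so the two cannot share the same bug. A failure message that includes the seed is enough to replay the exact session.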
Keep the public surface area small: one skill, wio, with command modes scan, test, workload, review, and doctor.
Detailed testing guidance belongs in plugins/wio/skills/wio/references/, not duplicated inside workflow files, cloud folders, subagents, hooks, or extra skill trees. When adding a reference topic, add both overview.md and tools.md, then link it from plugins/wio/skills/wio/references/index.md.
Host-specific files must stay minimal and point back to WIO:
- Claude Code plugins: shared install packages live in plugins/wio/ with .claude-plugin/plugin.json, root-level skills/, agents/, and hooks/.
- Claude Code marketplace: .claude-plugin/marketplace.json.
- Codex plugins: shared install packages live in plugins/wio/ with .codex-plugin/plugin.json, root-level skills/, and hooks/.
- Codex marketplace: .agents/plugins/marketplace.json.
- Codex subagents: project custom agents live in .codex/agents/*.toml per the official Codex subagents docs.
- Codex hooks: project hooks can live in .codex/hooks.json per the official Codex hooks docs.
The quality bar is simple: do not accept tests for coverage alone. A test should reduce real user risk, production risk, support load, debugging time, review time, or release risk.
License: MIT
