refactor(runtime): extract tool execution into pkg/runtime/toolexec#2545

dgageot · 2026-04-27T15:57:05Z

Goal

Make tool execution easier to read, test, and build on top of, by extracting it from pkg/runtime into a self-contained pkg/runtime/toolexec sub-package — following the existing pkg/runtime/compactor and pkg/runtime/delegator precedents.

pkg/runtime/tool_dispatch.go shrinks from 442 LOC of orchestration to 123 LOC of runtime adapters, and the new package is fully unit-testable without spinning up a runtime.

Public API

Zero changes. runtime.Runtime, runtime.ResumeRequest, runtime.ResumeType, the ResumeType* constants, runtime.LocalRuntime.Resume(...), the Run/RunStream/Steer/FollowUp flow — all byte-for-byte identical. Resume types are now type aliases over toolexec.

Commits (each one builds, lints, and tests cleanly)

#	Commit	What moved
1	`extract pure tool-execution utilities into pkg/runtime/toolexec`	Loop detector + per-tool model override (already pure helpers)
2	`move hooks-input helpers into toolexec`	`parseToolInput`, `newHooksInput` (+ tests)
3	`extract approval-pipeline decision into toolexec.Decide`	The yolo → checkers → read-only → default ladder, as a pure `Decide` function with full unit-test matrix
4	`extract Dispatcher into pkg/runtime/toolexec`	`toolexec.Dispatcher` + `Emitter` / `HookDispatcher` interfaces, with 10 dispatcher unit tests covering routing, approval outcomes, and batch-cancellation synthesis
5	`simplify dispatcher with per-call value type`	Internal cleanup: per-call `*call` value type to cut 7-arg method signatures, plus 4 review fixes (see below)

What lives in `pkg/runtime/toolexec` now

pkg/runtime/toolexec/
├── doc.go                      # explains the package's purpose
├── loop_detector.go            # consecutive-duplicate detector
├── model_override.go           # per-toolset model-override resolution
├── hooks_input.go              # ParseToolInput, NewHooksInput
├── permissions.go              # Decide(yolo, checkers, name, args, readOnly)
├── dispatcher.go               # Dispatcher + Process(ctx, sess, calls, tools, em)
└── *_test.go                   # 30+ unit tests, no runtime needed

Design highlights

Dispatcher carries its dependencies as struct fields: Tracer, Hooks, Resume, AgentFor, Permissions, Handlers. No method-on-LocalRuntime hidden state.
Emitter is a 5-method interface covering only the dispatch-shaped events (ToolCall, ToolCallResponse, ToolCallConfirmation, HookBlocked, MessageAdded). Anything else stays runtime-private.
HookDispatcher is a 2-method interface so the dispatcher doesn't depend on the runtime's hook registry directly.
Approval policy is pure: toolexec.Decide(...) → PermissionDecision{Outcome, Reason, Source}. Every log message the original code emitted is preserved via the Reason field, with no behavior change.
Per-call state lives in a *call value type so helper methods take (ctx) or (ctx, runTool) instead of 7 arguments.
ctx is never stored in a struct — passed explicitly to every method that needs it (idiomatic Go, satisfies containedctx).

Validation

go build ./... ✅
go vet ./... ✅
mise lint (golangci-lint + custom 800-file Go cop + go mod tidy --diff) → 0 issues ✅
mise test → all packages pass, no failures ✅
go test -race -count=1 ./pkg/runtime/... ✅

Reviewing tip

Each commit is independently reviewable. For the cleanest reading order:

Start with commit 1 to see the package's seed (pure utilities).
Skim commits 2–3 for the small, mechanical extractions.
Spend most time on commit 4 (the Dispatcher extraction) — that's where the architectural shape lives.
Commit 5 is internal cleanup of commit 4; the diff is mostly mechanical (per-call struct rather than wide signatures).

Suggested follow-ups (not in this PR)

Move ToolHandlerFunc and r.toolMap registration into toolexec so the runtime stops needing to bind handlers to a chan Event per stream.
Move pkg/runtime/event.go constructors to pkg/runtime/events, then drop the chanEmitter adapter in favour of a direct chan Event implementation of Emitter.

…time/toolexec Mirrors the compactor and delegator extractions: a new sub-package hosts the tool-execution primitives that don't need runtime-private state, leaving runtime to shrink toward pure orchestration. Moved verbatim (renamed to exported APIs, no behaviour change): - tool_loop_detector.go -> toolexec/loop_detector.go toolLoopDetector -> LoopDetector newToolLoopDetector(\u2026) -> NewLoopDetector(\u2026) record/reset/consecutive -> Record/Reset/Consecutive() - tool_model_override.go -> toolexec/model_override.go resolveToolCallModelOverride -> ResolveModelOverride Tests follow each file. loop.go is updated to call the new APIs. Assisted-By: docker-agent

Lift the two pure helpers used to build hooks.Input from a tool call out of pkg/runtime and into pkg/runtime/toolexec, where they belong with the rest of the tool-execution primitives: newHooksInput -> toolexec.NewHooksInput parseToolInput -> toolexec.ParseToolInput Both functions are pure and depend only on pkg/hooks, pkg/session and pkg/tools — packages already imported by toolexec callers — so the move doesn't introduce a new dependency. Tests added for ParseToolInput (empty/invalid/valid/empty-object) and NewHooksInput (field population, invalid-arguments path). tool_dispatch.go is updated to call the toolexec.* APIs; no behaviour change. Assisted-By: docker-agent

…ecide The yolo-then-checkers-then-readonly-then-default ladder used to live as a tangled switch inside (*LocalRuntime).executeWithApproval, mixing the policy decision with three different runtime side effects (run the tool, emit a deny error response, ask the user). Lift the policy out as a pure function: toolexec.Decide(yoloApproved, []NamedChecker, name, args, readOnly) -> PermissionDecision{Outcome, Reason, Source} Outcome = Allow | Deny | Ask (what to do) Reason = Yolo | Checker | ReadOnlyHint | Default (why; lets the caller log the same messages as before) Source = checker label, when Reason == Checker executeWithApproval becomes a flat 3-arm switch that just translates the decision into runtime side effects. permissionChecker (the unexported pair type) is replaced by toolexec.NamedChecker so the runtime adapter permissionCheckers() returns the type Decide consumes directly. ForceAsk now correctly overrides the read-only fast path (it always did, but the new Decide makes the invariant explicit and unit-testable). 12 table tests cover the full matrix in pkg/runtime/toolexec without needing a runtime; runtime tests still pass unchanged. Assisted-By: docker-agent

The 442-line tool dispatch in pkg/runtime/tool_dispatch.go used to be a set of methods on *LocalRuntime that pulled in tracer, hooks, resume channel, agent resolver, permissions, runtime-managed handlers, and the events channel implicitly via the receiver. Extract that orchestration into a self-contained toolexec.Dispatcher with explicit dependencies, so any embedder can drive tool execution without dragging the runtime in. API: type Dispatcher struct { Tracer trace.Tracer Hooks HookDispatcher // 2-method interface Resume <-chan ResumeRequest AgentFor func(*session.Session) *agent.Agent Permissions func(*session.Session) []NamedChecker Handlers map[string]ToolHandler } func (d *Dispatcher) Process(ctx, sess, calls, agentTools, em Emitter) Emitter is a 5-method interface covering the dispatch-shaped events (ToolCall, ToolCallResponse, ToolCallConfirmation, HookBlocked, MessageAdded). Anything else stays a runtime-private concern. Runtime side becomes a thin adapter: pkg/runtime/tool_dispatch.go (~140 LOC, was 442) wires: chanEmitter — chan Event sender for the 5 typed events hookDispatcher — wraps r.dispatchHook + r.executeOnUserInputHooks bindRuntimeHandlers — closes chan Event into ToolHandler shape pkg/runtime/runtime.go aliases the shared types so the public Runtime API is byte-for-byte identical: runtime.ResumeRequest = toolexec.ResumeRequest runtime.ResumeType = toolexec.ResumeType + the four ResumeType* constants 10 dispatcher unit tests cover routing (toolset vs runtime handler), approval outcomes (allow/deny/ask/cancel/reject/approve-tool), read-only auto-approval, and batch-cancellation synthesis. They run without a runtime, with race detection, in <0.1s each. Existing pkg/runtime tests (44 of them, including processToolCalls permission tests, transfer_task validation, yolo-mode bypass, steer race tests) all pass unchanged because processToolCalls remains a method on *LocalRuntime that just constructs a Dispatcher and calls Process. Assisted-By: docker-agent

The previous dispatcher.go threaded 7-8 arguments (ctx, sess, toolCall, tool, em, a, runTool, ...) through every helper, making signatures hard to scan and adding a paper cut every time a new piece of per-call state needed to be plumbed. Bundle that state in a value type and turn the helpers into methods on it. Changes: * Introduce an internal *call value (one per tool invocation) carrying {d, sess, em, a, tc, tool, available}. Most helper signatures shrink to (ctx, runTool) or just ctx — the rest comes from the receiver. * Track tool availability with an explicit 'available bool' instead of the tool.Name == "" sentinel. The synthetic 'unavailable' tool still carries the requested name so error events stay informative. * Rename the public method that wraps tracing/telemetry/event-emission around the actual handler call from execute() to invoke(), so it no longer collides conceptually with the runTool closure parameter that represents "deferred work to perform once approval clears". * Extract two tiny helpers from a 70-line method: - translateError(ctx, span, err) — context-cancel vs error mapping - buildMultiContent(text, images) — image MultiContent assembly * Drop the 3-line permissionsFor() nil-guard helper (inlined at the one caller) and the free logAllow() function (now a method that reads tool/session info from the receiver). * tool_dispatch.go: inline the 7-line bindRuntimeHandlers helper (single caller) and drop newHookDispatcher (struct literal at the one callsite). * runtime.go: remove a dead 'const _ ResumeType' block and an orphaned ResumeRequest doc comment left over from aliasing. Bug fix surfaced during review: Storing context.Context in *call would have been a documented Go anti-pattern (containedctx lint rule) — ctx is passed explicitly to every method that needs it instead. Validation: go build, go vet, go test -race ./pkg/runtime/..., golangci-lint run ./pkg/runtime/... — all clean. The full repository test suite passes unchanged. dispatcher.go grows by 60 lines (mostly comments and the small helpers); the *call refactor cuts roughly 80 argument-passing lines from the rest of the file. Assisted-By: docker-agent

dgageot requested a review from a team as a code owner April 27, 2026 15:57

dgageot force-pushed the board/refactoring-tool-execution-from-runtime-88169cdd branch from 10934f2 to f191a16 Compare April 27, 2026 17:22

rumpl previously approved these changes Apr 28, 2026

View reviewed changes

dgageot added 5 commits April 28, 2026 09:11

dgageot dismissed rumpl’s stale review via 66f4022 April 28, 2026 07:25

dgageot force-pushed the board/refactoring-tool-execution-from-runtime-88169cdd branch from f191a16 to 66f4022 Compare April 28, 2026 07:25

rumpl approved these changes Apr 28, 2026

View reviewed changes

dgageot merged commit 531701e into docker:main Apr 28, 2026
5 checks passed

BrewTestBot mentioned this pull request Apr 28, 2026

docker-agent 1.53.0 Homebrew/homebrew-core#279801

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(runtime): extract tool execution into pkg/runtime/toolexec#2545