feat(hooks): add 6 builtin hooks + widen post_tool_use / before_llm_call contract#2538

dgageot · 2026-04-27T15:04:50Z

Summary

Adds 6 new in-process hook builtins and a small hook-contract widening (post_tool_use / before_llm_call may now signal run termination via decision: "block"). The existing inline tool-loop detector is unchanged.

What's new

Five context-injecting builtins (commit 1)

Each follows the same shape as the existing add_date / add_environment_info / add_prompt_files. Opt-in via YAML — none auto-injected, since they shell out / read the filesystem and shouldn't run silently.

Name	Event	Behavior
`add_git_status`	`turn_start`	`git status --short --branch`
`add_git_diff`	`turn_start`	`git diff --stat` (or full unified with `args: [full]`)
`add_directory_listing`	`session_start`	Sorted top-level entries of `Cwd`, hidden skipped, capped at 100
`add_user_info`	`session_start`	`os/user.Current()` + `os.Hostname()`
`add_recent_commits`	`session_start`	`git log --oneline -n N` (default 10, override via `args: ["20"]`)

All git-backed builtins cap output at 4 KiB and treat git-not-installed / non-repo failures as graceful no-ops, so they can never abort a session or turn.

`max_iterations` builtin (commit 2)

A new stateful before_llm_call builtin — hard stop after N model calls, configurable per-agent in YAML:

hooks:
  before_llm_call:
    - {type: builtin, command: max_iterations, args: ["50"]}

Additive, not a replacement. The existing agent.MaxIterations flag with its MaxIterationsReachedEvent and interactive resume dialog is unchanged — TUI / CLI / ACP all still use it. The new builtin is a hard-stop alternative for headless / cost-control scenarios.

Per-session counter cleared from session_end via builtins.State.ClearSession.

Hook contract widening (commit 2)

Documents and enforces that decision: "block" (or continue: false / exit code 2) from a post_tool_use or before_llm_call hook signals run termination. Previously these events were observational only. The runtime now translates a deny verdict into the standard Error + notification(level=error) + on_error fanout via a new emitHookDrivenShutdown helper.

User-authored hooks can now stop the run; the inline tool-loop detector is unchanged.

Refactors and fixes (commits 3 + 4)

processToolCalls / executeWithApproval / askUserForConfirmation: closure plumbing simplified — LocalRuntime.runTool now returns a toolApprovalOutcome so callers can return invoke() everywhere instead of unpacking and repacking a (stop, msg) tuple in 7 places.
loop_detector per-call vs per-batch behavior change: reverted in commit 5 after review concerns. The inline tool_loop_detector is byte-identical to origin/main.
Bug fix: before_llm_call was dispatched twice per loop iteration after a rebase; once-per-iteration is now pinned by a regression test (pkg/runtime/before_llm_call_test.go).
sortKeys (private, used by builtins) made non-mutating to avoid a foot-gun for future callers.
gitOutput's empty-dir error message trimmed to Go style (no function name prefix).
Stale comment in hooks_wiring_test.go updated post eager-build refactor.

Commits

feat(hooks/builtins): add five context-injecting builtins
feat(hooks): widen contract for post_tool_use / before_llm_call + extract loop_detector & max_iterations builtins
refactor(hooks/runtime): simplify approval plumbing and trim docstrings
fix(runtime,hooks): review-driven fixes (duplicate before_llm_call dispatch + 4 quality)
revert(hooks): drop loop_detector builtin, keep inline tool-loop detector

(Each commit message has full details — left as separate commits for review legibility; squash-or-merge as you prefer.)

Validation

go build ./... ✅
go test ./... (97 packages) → all pass
go test -race ./pkg/hooks/... ./pkg/runtime/... → clean
go test ./e2e/... → pass
go test ./pkg/tui/... ./pkg/cli/... ./pkg/acp/... → pass
golangci-lint run ./... → 0 issues
Full binary builds (go build .)

Behavior changes for existing users

None for existing in-repo configs. Loop detection, max_iterations UX, and all default agent flags behave identically to origin/main.
The post_tool_use / before_llm_call deny-verdict contract is now honored. If a user happened to have a hook on those events returning decision: "block" it would have been silently ignored before; it now actually stops the run. No in-repo configs do this.

Honest caveats

All in-repo tests pass (unit + integration + e2e + TUI + CLI + ACP), but I haven't exercised the runtime against a real LLM. The contract widening tests pin executor-level semantics; integration tests cover the runtime fanout.
The new git-shelling builtins are bounded by the hooks executor's 60 s timeout — pathological repos will time out cleanly rather than hang.

🤖 Generated with Claude Code via docker-agent

Adds five new in-process hook builtins that follow the same pattern as add_date / add_environment_info / add_prompt_files. They are opt-in via `{type: builtin, command: ...}` in YAML and not auto-injected from any agent flag, since they shell out / read the filesystem and shouldn't run silently: - add_git_status (turn_start) `git status --short --branch` - add_git_diff (turn_start) `git diff --stat`, "full" arg for unified - add_directory_listing (session_start) sorted top-level entries, hidden skipped - add_user_info (session_start) os/user.Current + os.Hostname - add_recent_commits (session_start) `git log --oneline -n N` (default 10) All git-shelling builtins cap output at 4 KiB and treat git-not-installed or non-repo failures as graceful no-ops so they can never abort a session or turn. add_directory_listing caps the visible entries at 100 with a "... and N more" summary. Adds a private `gitOutput` helper, a `sessionStartContext` mirror of the existing `turnStartContext`, and an `initGitRepo` test helper that skips the suite when git isn't installed. Assisted-By: docker-agent

…ract loop_detector & max_iterations builtins Documents that decision="block" / continue=false / exit code 2 from a post_tool_use or before_llm_call hook signals run termination — not just the previously-undocumented behaviour of pre_tool_use deny. The executor's aggregate logic already produced Allowed=false uniformly for these verdicts; the runtime now actually acts on them. Adds `hooks.DecisionBlockValue` so in-process builtins don't carry the "block" string literal. * `executePostToolHook` and `executeBeforeLLMCallHooks` now return (stop, message). `processToolCalls` aggregates and propagates the post-tool stop up to the run loop, synthesising error responses for any unprocessed tool calls in the batch (same shape as user cancellation) so the API doesn't reject orphan function calls. * New `emitHookDrivenShutdown` factors the standard Error / notification(level=error) / on_error fanout used by both the post_tool_use and before_llm_call shutdown paths. * `executeWithApproval` now returns a `toolApprovalOutcome` struct instead of a single canceled bool to thread the new stop signal through every approval path (yolo, allow, force-ask, default, user prompt). Replaces the inline `tool_loop_detector` previously in pkg/runtime. Stateful per-session via `builtins.State`, registered through `builtins.Register` and cleared on session_end. Auto-injected from `agent.MaxConsecutiveToolCalls()` (defaulting to 5 to preserve the historical always-on contract). **Behaviour change:** signatures are now per-call, not per-batch. Single- tool repetition (the dominant stuck-agent pattern) and parallel-identical batches still trip; alternating multi-tool batches like `[A,B] [A,B] [A,B]` no longer trip — that case should now be caught by max_iterations or manual threshold tuning. Polling tools (view_background_agent, view_background_job) remain invisible to the counter. **Additive**, not a replacement: the existing `agent.MaxIterations` flow (MaxIterationsReachedEvent + interactive resume dialog in TUI/CLI/ACP) is unchanged because those UIs depend on the special event. The new builtin gives users a hard-stop alternative they can configure purely in YAML, with no resume protocol and no special event: hooks: before_llm_call: - {type: builtin, command: max_iterations, args: ["50"]} * `pkg/hooks/contract_widening_test.go` pins decision=block and continue=false producing Allowed=false on post_tool_use and before_llm_call. * `pkg/hooks/builtins/loop_detector_test.go` covers threshold tripping, JSON-key-order normalisation, exempt-tool invisibility, per-session isolation, lenient arg parsing, and concurrent safety. * `pkg/hooks/builtins/max_iterations_test.go` covers limit tripping, per-session isolation, lenient arg parsing, and concurrent safety. * `hooks_wiring_test.go` updated to reflect that loop_detector is auto-injected on every agent. * `pkg/runtime/tool_loop_detector.go` and its test (logic moved to `pkg/hooks/builtins/loop_detector.go` with per-call semantics). Lint clean, all tests pass with -race. Assisted-By: docker-agent

Pure cleanup pass over the two preceding feature commits. No features removed, no public API changes. * `LocalRuntime.runTool` now returns `toolApprovalOutcome` directly, so callers don't repeat `stop, msg := runTool(); return outcome{...}` in seven places. * Local `runTool` closure in `processToolCalls` is renamed to `invoke` (no more shadowing the method) and extracted into a new `toolInvoker` helper for clarity. * `executeWithApproval` and `askUserForConfirmation` now `return invoke()` in every approval path \u2014 every one of those paths used to be a two-line tuple-unpack-and-pack. * `processToolCalls` synthesised-error loops are extracted into a `synthesizeRemaining` closure and wrapped in a `switch` with `canceled`/`stopRun`/`default` arms so the control flow is visible at a glance. * `loop_detector`: inline `parseLoopDetectorArgs`; replace the per-call `map[string]struct{}` exempt set with `slices.Contains` over a slice (no allocations on the hot path); shorter state field names (`sig`, `count`). * `max_iterations`: inline `parseMaxIterationsArgs`; collapse the `(int, bool)` parse-result into a single `limit\u003c=0\u200a\u2192\u200ano-op` check. * `json.go`: previously held `canonicalJSON` and `joinNonEmpty` helpers that were already deleted as unused; the file is now just `sortKeys` + `canonicalToolInput`. Trimmed the verbose comments added in the previous two commits to match the surrounding house style. The contract is the same; the prose just earned its keep: * `emitHookDrivenShutdown`, `executeBeforeLLMCallHooks`, `executeSessionEndHooks`, `loopDetectorExemptTools`, the `maxConsecutive` normalisation block, the `builtinsState` field comment. * Package doc and `State` / `Register` / `AgentDefaults` doc in pkg/hooks/builtins. * `hooks.EventPostToolUse`, `hooks.EventBeforeLLMCall`, and `hooks.DecisionBlockValue` docs. 9 files changed, 238 insertions(+), 395 deletions(-) Lint clean. `go test ./...` and `go test -race ./pkg/hooks/... ./pkg/runtime/...` all pass. Assisted-By: docker-agent

…spatch + 4 quality) Review of the three feature commits surfaced one critical bug introduced during the rebase plus a handful of low-impact issues. Fixes them, with a regression test for the critical one. ## Critical bug **before_llm_call hook fired twice per loop iteration.** The rebase left two identical dispatch blocks in [LocalRuntime.RunStream], so any stateful before_llm_call hook \u2014 prominently the new max_iterations builtin \u2014 advanced its counter twice per LLM call and tripped at half the configured limit. Other affected handlers: any user-authored before_llm_call hook with side effects (audit logging, cost meters). Adds pkg/runtime/before_llm_call_test.go to pin "fires exactly once per iteration". I verified the test fails on the buggy code and passes on the fix before checking either in. ## Quality fixes * pkg/hooks/builtins/json.go: sortKeys was mutating []any slices in place. Currently safe because each builtin gets a freshly-decoded Input, but it's a foot-gun for any future caller that re-uses inputs. sortKeys now returns a deep copy. * pkg/runtime/tool_dispatch.go: processToolCalls switch had three arms each ending in nearly-identical span End() / SetStatus (codes.Ok, "...") boilerplate. Pulled span finalisation up before the switch so each arm only carries the logic that's actually distinct. * pkg/hooks/builtins/git.go: gitOutput's empty-dir error included the function name ("gitOutput: empty working directory") against Go style. Trimmed. * pkg/runtime/hooks_wiring_test.go: stale comment referred to "caching by agent name"; the post-rebase eager-build doesn't cache, it just looks up. Updated. ## Validation go test ./... -> all packages pass go test -race ./pkg/hooks/... ./pkg/runtime/... -> clean golangci-lint run ./... -> 0 issues Assisted-By: docker-agent

…ctor Reverts the loop-detector portion of the builtin extraction. The behavior change from per-batch to per-call detection turned out to be the wrong trade-off: alternating multi-tool batches like `[A,B] [A,B]` are a real failure mode that the inline detector caught and the per-call builtin missed. Surgical revert \u2014 keeps the rest of the work intact: * post_tool_use / before_llm_call deny contract widening stays (general feature, no behavior change without an explicit hook). * max_iterations builtin stays (separate from loop detection, no change to the existing agent.MaxIterations UX). * The five context-injecting builtins stay. * The duplicate-dispatch fix and simplifications stay. ## Removed * pkg/hooks/builtins/loop_detector.go and its test. * pkg/hooks/builtins/json.go (canonicalToolInput / sortKeys were only consumed by the loop_detector builtin). * `builtins.LoopDetector` constant. * `AgentDefaults.MaxConsecutiveToolCalls` and `ExemptLoopTools` fields, plus the auto-injection branch in ApplyAgentDefaults. * `State.loopDetector` field. ## Restored from origin/main * pkg/runtime/tool_loop_detector.go (per-batch signature detector). * pkg/runtime/tool_loop_detector_test.go. * Inline `loopDetector := newToolLoopDetector(...)` setup at the top of RunStream's goroutine. * `if loopDetector.record(res.Calls)` block emitting Error + notifyError after a tool batch in loop.go. * `MaxConsecutiveToolCalls` is once again read straight from the session, with the historic threshold-of-5 default. ## Tests * hooks_wiring_test.go: "no flags: no implicit hooks, no executor" case is back; loop_detector-specific assertions removed. * builtins_test.go: removed LoopDetector from the registration roster. ## Validation go build ./... -> ok go test ./... -> all packages pass go test -race ./pkg/hooks/... ./pkg/runtime/... -> clean golangci-lint run ./... -> 0 issues Assisted-By: docker-agent

rumpl

Yes I read it

dgageot added 5 commits April 27, 2026 16:20

dgageot requested a review from a team as a code owner April 27, 2026 15:04

rumpl approved these changes Apr 27, 2026

View reviewed changes

dgageot merged commit 64b2bd1 into docker:main Apr 27, 2026
9 checks passed

BrewTestBot mentioned this pull request Apr 28, 2026

docker-agent 1.53.0 Homebrew/homebrew-core#279801

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(hooks): add 6 builtin hooks + widen post_tool_use / before_llm_call contract#2538