feat(hooks): add before_compaction and after_compaction events#2537

dgageot · 2026-04-27T15:04:22Z

What this PR does

Adds two new hook events that make session compaction first-class in
the hooks mechanism, plus a refactor that splits the compaction
strategy into its own package so future strategies can land alongside
the default one without bloating pkg/runtime.

New hook events

before_compaction fires immediately before the runtime
compacts a session. The Input carries input_tokens,
output_tokens, context_limit, and a compaction_reason of
\"threshold\" (proactive 90% trigger), \"overflow\"
(post-overflow recovery) or \"manual\" (user-invoked
/compact). Hooks can:
- Veto the compaction (Decision: \"block\" or exit code 2)
- Replace the LLM-generated summary by returning a non-empty
  hookSpecificOutput.summary — the runtime applies that string
  verbatim and skips the model call entirely
after_compaction fires after a successful compaction. Receives
the produced summary together with the pre-compaction
input_tokens / output_tokens (what was summarized) so
observability handlers can naturally express "compacted from X to
Y". Purely observational; output is ignored.

Compactor extraction (`pkg/runtime/compactor`)

The LLM-summarization strategy and session-aware extraction helpers
move out of pkg/runtime into a new package, leaving
pkg/runtime/session_compaction.go as a thin orchestrator (~130 LoC,
down from ~200) that does only what's intrinsically runtime-private:
hook dispatch, session mutation, event emission, and persistence.

Pure-logic helpers (SplitIndexForKeep, FirstIndexInBudget) move
to pkg/compaction where they join ShouldCompact and
EstimateMessageTokens. The default LLM strategy
(compactor.RunLLM) becomes a public function that user-supplied
before_compaction hooks can call back into if they want to delegate
to the default with tweaks (different model, different prompt) instead
of copying the LLM-call boilerplate.

To avoid an import cycle (pkg/runtime/compactor doesn't import
pkg/runtime), RunLLM accepts a RunAgent callback that the
runtime supplies — the actual sub-runtime construction with
WithSessionCompaction(false) stays in the runtime.

Behavior preservation

The mission-critical "no hooks configured" path is bit-for-bit
unchanged. Three regression guards pin this:

TestCompaction (origin/main, end-to-end LLM compaction with a
fake stream provider) — passes
TestCompactionOverflowDoesNotLoop — passes
TestDoCompactNoHooksMatchesPriorBehavior (new) — asserts
identical event sequence (SessionCompaction(\"started\") → SessionSummary → SessionCompaction(\"completed\")) when no
compaction hooks are configured

Reviewer pass

A full review was performed before merging. The findings worth
calling out:

session_start re-emit on compact — initially I added a
Source=\"compact\" re-emit so hooks like add_environment_info
could re-inject context. Then upstream
#2536 ("don't
persist session_start hook output as a session message") landed
and made session_start output transient — threaded into every
model call automatically. The re-emit became redundant and was
dropped along with its test; session_start now fires exactly
once per RunStream as before.
after_compaction token counts — the reviewer pass moved this
contract to pre-compaction values ("compacted from X to Y")
before the API had any users. Asserted in
TestDoCompactAfterHookFires by seeding
sess.{Input,Output}Tokens=1234,567 and verifying those exact
numbers reach the hook.

Files changed

Area	Files
Hook contract	`pkg/hooks/types.go`, `pkg/hooks/executor.go`, `pkg/hooks/config.go`
Hook tests	`pkg/hooks/hooks_test.go`, `pkg/hooks/dispatch_test.go`
Runtime hook helpers	`pkg/runtime/hooks.go`
Compaction orchestrator	`pkg/runtime/session_compaction.go`, `pkg/runtime/loop.go`, `pkg/runtime/runtime.go`
New compactor package	`pkg/runtime/compactor/{compactor,compactor_test}.go`
Pure-logic helpers	`pkg/compaction/{compaction,compaction_test}.go`
Config + schema	`pkg/config/latest/types.go`, `agent-schema.json`
Example	`examples/hooks.yaml` (with a commented-out custom-summary block)
Runtime tests	`pkg/runtime/session_compaction_test.go`

Caveats documented in the schema and example

Vetoing on compaction_reason=\"overflow\" means the next LLM call
hits the same overflow. The runtime allows at most one
retry-with-compaction
(maxOverflowCompactions),
so a second denial fails the turn and surfaces an Error event
(not an infinite loop). Hook authors should gate denials by reason.

Verification

go build ./... — green
go test ./pkg/... ./cmd/... — all green
go test -race ./pkg/runtime/... ./pkg/compaction/... ./pkg/hooks/... ./pkg/config/... — green with -race
golangci-lint run ./... — 0 issues

Session compaction is now first-class in the hooks mechanism. Two new events let users observe, veto, or replace a compaction without touching the runtime: - before_compaction fires immediately before a compaction. The hook receives input_tokens, output_tokens, context_limit and a compaction_reason of "threshold" (proactive 90% trigger), "overflow" (post-overflow recovery) or "manual" (user-invoked /compact). It can veto the compaction (Decision: "block") or supply a custom summary string via HookSpecificOutput.summary, in which case the runtime applies that summary verbatim and skips the LLM-based summarization. - after_compaction fires once a summary has been applied to the session. Receives the produced summary text in Input.summary alongside the same token / reason fields. Purely observational. The runtime also now re-emits session_start with Source="compact" right after a successful compaction so hooks like add_environment_info can re-inject ambient context that may have been lost in the summary. This honors the previously dead "compact" Source value documented on hooks.Input.Source. When no compaction-related hooks are configured, control flow through doCompact is bit-for-bit identical to the previous implementation — guarded by a new TestDoCompactNoHooksMatchesPriorBehavior regression test. dispatchHook short-circuits when no executor is configured, so the hook dispatch sites are essentially free in the common case. Implementation notes: - pkg/hooks/types.go gets the two new EventType constants, two new Config fields, six new Input fields (input_tokens, output_tokens, context_limit, compaction_reason, summary; the existing source field is documented to include "compact"), and a Summary field on HookSpecificOutput plus Result. aggregate() surfaces the first non-empty Summary into Result.Summary; concurrent execution makes "first wins" deterministic at the value level (no concatenation, no clobbering). - pkg/runtime/hooks.go adds executeBeforeCompactionHooks, executeAfterCompactionHooks, and a parameterised executeSessionStartHooksWithSource (the existing executeSessionStartHooks is now a thin wrapper). - pkg/runtime/session_compaction.go threads a reason parameter into doCompact and adds a computeFirstKeptEntry helper so the hook-supplied-summary path keeps the same maxKeepTokens tail-keep policy as the LLM path. A new internal compactWithReason method on LocalRuntime forwards the reason; the public Summarize stays unchanged and reports "manual". loop.go callsites tag "threshold" and "overflow". Schema and example: - agent-schema.json adds the two new properties on HooksConfig with the full contract documented (TestSchemaMatchesGoTypes covers drift between schema and Go types). - examples/hooks.yaml adds two well-commented blocks demonstrating observational logging and veto/replace semantics, with a warning about denying on compaction_reason="overflow". Tests: - pkg/hooks: 5 new tests covering allow-by-default, exit-code-2 veto, HookSpecificOutput.summary surfacing into Result.Summary, first-summary-wins under concurrent dispatch, after_compaction's observational contract, and the JSON wire format. - pkg/runtime: 5 new tests covering deny-skips-everything, hook-summary-skips-LLM, after_compaction fires with the produced summary, session_start re-emit with Source="compact", and the no-hooks regression guard. Drive-by fix: a merge between docker#2524 (PrepareForkSubSession extraction) and the earlier model-override series left pkg/runtime/skill_runner.go referencing two no-longer-defined symbols (skill, params). PreparedSkillFork now carries the Model override and the runner reads from prepared.Model / prepared.SkillName. Assisted-By: docker-agent

Move the LLM-summarization strategy and session-aware extraction helpers out of the runtime into a new pkg/runtime/compactor package. The runtime keeps only what's intrinsically runtime-private: deciding when to compact (loop.go), dispatching hooks, and applying the produced summary to the session. The shape of the change: - pkg/compaction (pure logic, no session/agent deps) gains SplitIndexForKeep and FirstIndexInBudget — moved verbatim from the runtime's splitIndexForKeep / firstMessageToKeep with their user/assistant boundary-snap semantics intact. - pkg/runtime/compactor (new) owns the session-aware bits: ExtractMessages, ComputeFirstKeptEntry, MapToSessionIndex, and the default LLM strategy RunLLM. The strategy receives the runtime as a RunAgent callback rather than importing pkg/runtime, so the dependency direction stays one-way (runtime → compactor). MaxKeepTokens and MaxSummaryTokens move here as exported constants so user-supplied strategies can match the runtime's policy. - pkg/runtime/session_compaction.go shrinks from 197 LoC of mixed orchestration + LLM call + extraction helpers to ~130 LoC of pure orchestration: lookup contextLimit, dispatch before_compaction, choose strategy (hook-supplied or default LLM), apply the summary, emit events, dispatch after_compaction. The produceSummary helper makes the hook-vs-default branching obvious at a glance. Tests follow ownership: TestExtractMessages*, TestComputeFirstKeptEntry and TestMapToSessionIndex move to pkg/runtime/compactor; TestSplitIndexForKeep and a new TestFirstIndexInBudget land in pkg/compaction. The runtime keeps only the integration tests (TestDoCompact*, TestSessionGetMessages_*) and the gold-standard TestCompaction continues to exercise the full LLM-driven path through loop.go. Behaviorally: zero changes. TestCompaction, TestCompactionOverflowDoesNotLoop and TestDoCompactNoHooksMatchesPriorBehavior all pass — the sub-runtime-with-WithSessionCompaction(false) construction still lives in runCompactionAgent, so the existing recursion guard is unchanged. What strategy authors gain: a public compactor.RunLLM / compactor.LLMArgs / compactor.RunAgent surface they can call from a custom before_compaction hook to delegate back to the default strategy with tweaks (different prompt, different model) instead of copying ~40 lines of LLM-call boilerplate. Assisted-By: docker-agent

Three small simplifications to make compaction easier to read and maintain. No behavior change — every existing test (TestCompaction, TestCompactionOverflowDoesNotLoop, TestDoCompact*, TestExtractMessages, TestSplitIndexForKeep, ...) passes unchanged with -race. 1. pkg/runtime/compactor: tighten the public surface Unexport MaxKeepTokens, ExtractMessages, and MapToSessionIndex. They were exported on the speculation that future Go-level strategy authors might want them, but: (a) the package lives under pkg/runtime/ which already signals internal scope, (b) user-supplied strategies are JSON-out before_compaction hooks that don't see Go symbols anyway, and (c) any future strategy added inside this package can use the unexported names. They can always be re-exported when a real client emerges (YAGNI). The remaining public surface is the minimum the runtime needs: MaxSummaryTokens, Result, RunAgent, LLMArgs, RunLLM, and ComputeFirstKeptEntry. 2. pkg/runtime/compactor: extract nonSystemMessages helper Both extractMessages and ComputeFirstKeptEntry started with the same loop: var messages []chat.Message for _, msg := range sess.GetMessages(a) { if msg.Role == chat.MessageRoleSystem { continue } messages = append(messages, msg) } Moved into a single nonSystemMessages helper, plus the cost / cache-control clear in extractMessages now mutates in place via indexing rather than a copy-then-append, which makes the intent ("clear these flags on the local copy") more obvious. toItems also gets a make([]session.Item, len) preallocation while we're here. 3. pkg/runtime/session_compaction.go: flatten doCompact The previous shape used a produceSummary helper that took six parameters and dispatched between the hook-supplied and LLM-default paths internally. Replaced with: - summaryFromHook(sess, a, pre) — a tiny free function that lifts a before_compaction Summary into a compactor.Result when one was supplied, or returns nil to signal "fall through to the LLM path". - The LLM path is now inline in doCompact, so the orchestrator reads top-to-bottom: lookup contextLimit → before_compaction veto-check → started event → choose strategy (hook or LLM) → apply summary → SessionSummary → after_compaction. No more six-parameter helper, no more nested return paths through a separate function. Net effect: doCompact loses its inner-helper indirection but keeps the same number of error returns, and the "hook-supplied-vs-LLM-default" decision is now a single nil check at the doCompact level. Diff summary across the three files: -118 / +115 lines, with a proportional reduction in cross-function indirection (one fewer helper in session_compaction.go, two fewer exported symbols in compactor). Assisted-By: docker-agent

Triaged the reviewer findings against the code. One real fix and a batch of documentation / contract clarifications. None of the alleged P0/P1 correctness regressions held up: - SplitIndexForKeep / FirstIndexInBudget allegedly inverted: misread. The functions return the boundary INDEX (messages[:N] is the compact window, messages[N:] is the keep window). Returning len(messages) when everything fits — i.e. "compact everything, keep nothing" — is the documented and tested contract from origin/main. "Fixing" it to return 0 would break manual /compact on small sessions (it would become a no-op). - session_start started emitted before deny check: misread. The deny return is on line 65, the started emit on line 69. Asserted by TestDoCompactBeforeHookDeniesSkipsCompaction. - Session-mutation race: pre-existing property of the codebase, not introduced by this PR. All compaction call sites are on the main RunStream goroutine; if that ever changes, the fix is at the session layer, not here. - Token estimation mismatch and silent truncation on tiny budgets: pre-existing edge cases inherited from origin/main. Real fix: - pkg/runtime/{session_compaction,hooks}.go: after_compaction now receives the *pre-compaction* InputTokens / OutputTokens (what was summarized) instead of the post-compaction values. This is the natural shape for an observability hook ("compacted from X to Y"); the post-compaction counts already reach UIs through the next TokenUsageEvent. Snapshot before mutation, pass into executeAfterCompactionHooks, plumbed through. The new TestDoCompactAfterHookFires seeds sess.{Input,Output}Tokens=1234,567 and asserts those exact values reach the hook (a regression to post-compaction would surface as ~EstimateMessageTokens(summary)|0). Documentation clarifications (no behavior change): - pkg/compaction/compaction.go: rewrote SplitIndexForKeep and FirstIndexInBudget docs to make the boundary-INDEX contract unambiguous (messages[:N] / messages[N:]) and to spell out why the "all fits → compact everything" branch is the right answer for manual /compact, threshold compaction on tiny windows, and overflow recovery. Renamed the relevant test case so its intent is obvious. - pkg/hooks/types.go: EventBeforeCompaction overflow-denial warning now spells out the actual consequence (one retry-with-compaction guarded by maxOverflowCompactions, then an Error event — not an infinite loop). EventAfterCompaction docstring documents the pre-compaction-token-counts contract. Input.ContextLimit and Input.Summary docstrings explain when each is zero / empty. - pkg/hooks/executor.go: aggregate's "first summary wins" comment now spells out "first in CONFIG order, hooks run concurrently but write to indexed slots" so future readers can't suspect an order race. - pkg/runtime/compactor/compactor.go: extractMessages clearing of Cost / CacheControl now has a comment explaining why (avoid double-counting cost; don't pin the parent's cache checkpoint inside the throwaway compaction sub-call). - pkg/runtime/session_compaction.go: summaryFromHook clarifies that Result.Cost is left at zero because no LLM call ran. - examples/hooks.yaml: the after_compaction example now shows the "compacted from X to Y" pattern; added a commented-out custom-summary example demonstrating the non-LLM replacement path (use cases: deterministic strategies, regulated environments, deterministic CI tests). Assisted-By: docker-agent

dgageot requested a review from a team as a code owner April 27, 2026 15:04

dgageot force-pushed the board/hooks-mechanism-support-for-session-comp-60daade3 branch from 3852623 to f26a89c Compare April 27, 2026 17:13

rumpl previously approved these changes Apr 28, 2026

View reviewed changes

dgageot added 4 commits April 28, 2026 09:14

dgageot dismissed rumpl’s stale review via 1e9512e April 28, 2026 07:19

dgageot force-pushed the board/hooks-mechanism-support-for-session-comp-60daade3 branch from f26a89c to 1e9512e Compare April 28, 2026 07:19

rumpl approved these changes Apr 28, 2026

View reviewed changes

dgageot merged commit 73a4dbc into docker:main Apr 28, 2026
5 checks passed

BrewTestBot mentioned this pull request Apr 28, 2026

docker-agent 1.53.0 Homebrew/homebrew-core#279801

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(hooks): add before_compaction and after_compaction events#2537