feat(hooks): add before_compaction and after_compaction events#2537
Merged
dgageot merged 4 commits intodocker:mainfrom Apr 28, 2026
Merged
Conversation
3852623 to
f26a89c
Compare
rumpl
previously approved these changes
Apr 28, 2026
Session compaction is now first-class in the hooks mechanism. Two new events let users observe, veto, or replace a compaction without touching the runtime: - before_compaction fires immediately before a compaction. The hook receives input_tokens, output_tokens, context_limit and a compaction_reason of "threshold" (proactive 90% trigger), "overflow" (post-overflow recovery) or "manual" (user-invoked /compact). It can veto the compaction (Decision: "block") or supply a custom summary string via HookSpecificOutput.summary, in which case the runtime applies that summary verbatim and skips the LLM-based summarization. - after_compaction fires once a summary has been applied to the session. Receives the produced summary text in Input.summary alongside the same token / reason fields. Purely observational. The runtime also now re-emits session_start with Source="compact" right after a successful compaction so hooks like add_environment_info can re-inject ambient context that may have been lost in the summary. This honors the previously dead "compact" Source value documented on hooks.Input.Source. When no compaction-related hooks are configured, control flow through doCompact is bit-for-bit identical to the previous implementation — guarded by a new TestDoCompactNoHooksMatchesPriorBehavior regression test. dispatchHook short-circuits when no executor is configured, so the hook dispatch sites are essentially free in the common case. Implementation notes: - pkg/hooks/types.go gets the two new EventType constants, two new Config fields, six new Input fields (input_tokens, output_tokens, context_limit, compaction_reason, summary; the existing source field is documented to include "compact"), and a Summary field on HookSpecificOutput plus Result. aggregate() surfaces the first non-empty Summary into Result.Summary; concurrent execution makes "first wins" deterministic at the value level (no concatenation, no clobbering). - pkg/runtime/hooks.go adds executeBeforeCompactionHooks, executeAfterCompactionHooks, and a parameterised executeSessionStartHooksWithSource (the existing executeSessionStartHooks is now a thin wrapper). - pkg/runtime/session_compaction.go threads a reason parameter into doCompact and adds a computeFirstKeptEntry helper so the hook-supplied-summary path keeps the same maxKeepTokens tail-keep policy as the LLM path. A new internal compactWithReason method on LocalRuntime forwards the reason; the public Summarize stays unchanged and reports "manual". loop.go callsites tag "threshold" and "overflow". Schema and example: - agent-schema.json adds the two new properties on HooksConfig with the full contract documented (TestSchemaMatchesGoTypes covers drift between schema and Go types). - examples/hooks.yaml adds two well-commented blocks demonstrating observational logging and veto/replace semantics, with a warning about denying on compaction_reason="overflow". Tests: - pkg/hooks: 5 new tests covering allow-by-default, exit-code-2 veto, HookSpecificOutput.summary surfacing into Result.Summary, first-summary-wins under concurrent dispatch, after_compaction's observational contract, and the JSON wire format. - pkg/runtime: 5 new tests covering deny-skips-everything, hook-summary-skips-LLM, after_compaction fires with the produced summary, session_start re-emit with Source="compact", and the no-hooks regression guard. Drive-by fix: a merge between docker#2524 (PrepareForkSubSession extraction) and the earlier model-override series left pkg/runtime/skill_runner.go referencing two no-longer-defined symbols (skill, params). PreparedSkillFork now carries the Model override and the runner reads from prepared.Model / prepared.SkillName. Assisted-By: docker-agent
Move the LLM-summarization strategy and session-aware extraction helpers out of the runtime into a new pkg/runtime/compactor package. The runtime keeps only what's intrinsically runtime-private: deciding when to compact (loop.go), dispatching hooks, and applying the produced summary to the session. The shape of the change: - pkg/compaction (pure logic, no session/agent deps) gains SplitIndexForKeep and FirstIndexInBudget — moved verbatim from the runtime's splitIndexForKeep / firstMessageToKeep with their user/assistant boundary-snap semantics intact. - pkg/runtime/compactor (new) owns the session-aware bits: ExtractMessages, ComputeFirstKeptEntry, MapToSessionIndex, and the default LLM strategy RunLLM. The strategy receives the runtime as a RunAgent callback rather than importing pkg/runtime, so the dependency direction stays one-way (runtime → compactor). MaxKeepTokens and MaxSummaryTokens move here as exported constants so user-supplied strategies can match the runtime's policy. - pkg/runtime/session_compaction.go shrinks from 197 LoC of mixed orchestration + LLM call + extraction helpers to ~130 LoC of pure orchestration: lookup contextLimit, dispatch before_compaction, choose strategy (hook-supplied or default LLM), apply the summary, emit events, dispatch after_compaction. The produceSummary helper makes the hook-vs-default branching obvious at a glance. Tests follow ownership: TestExtractMessages*, TestComputeFirstKeptEntry and TestMapToSessionIndex move to pkg/runtime/compactor; TestSplitIndexForKeep and a new TestFirstIndexInBudget land in pkg/compaction. The runtime keeps only the integration tests (TestDoCompact*, TestSessionGetMessages_*) and the gold-standard TestCompaction continues to exercise the full LLM-driven path through loop.go. Behaviorally: zero changes. TestCompaction, TestCompactionOverflowDoesNotLoop and TestDoCompactNoHooksMatchesPriorBehavior all pass — the sub-runtime-with-WithSessionCompaction(false) construction still lives in runCompactionAgent, so the existing recursion guard is unchanged. What strategy authors gain: a public compactor.RunLLM / compactor.LLMArgs / compactor.RunAgent surface they can call from a custom before_compaction hook to delegate back to the default strategy with tweaks (different prompt, different model) instead of copying ~40 lines of LLM-call boilerplate. Assisted-By: docker-agent
Three small simplifications to make compaction easier to read and
maintain. No behavior change — every existing test (TestCompaction,
TestCompactionOverflowDoesNotLoop, TestDoCompact*, TestExtractMessages,
TestSplitIndexForKeep, ...) passes unchanged with -race.
1. pkg/runtime/compactor: tighten the public surface
Unexport MaxKeepTokens, ExtractMessages, and MapToSessionIndex.
They were exported on the speculation that future Go-level
strategy authors might want them, but: (a) the package lives
under pkg/runtime/ which already signals internal scope, (b)
user-supplied strategies are JSON-out before_compaction hooks
that don't see Go symbols anyway, and (c) any future strategy
added inside this package can use the unexported names. They
can always be re-exported when a real client emerges (YAGNI).
The remaining public surface is the minimum the runtime needs:
MaxSummaryTokens, Result, RunAgent, LLMArgs, RunLLM, and
ComputeFirstKeptEntry.
2. pkg/runtime/compactor: extract nonSystemMessages helper
Both extractMessages and ComputeFirstKeptEntry started with the
same loop:
var messages []chat.Message
for _, msg := range sess.GetMessages(a) {
if msg.Role == chat.MessageRoleSystem { continue }
messages = append(messages, msg)
}
Moved into a single nonSystemMessages helper, plus the cost /
cache-control clear in extractMessages now mutates in place via
indexing rather than a copy-then-append, which makes the intent
("clear these flags on the local copy") more obvious. toItems
also gets a make([]session.Item, len) preallocation while we're
here.
3. pkg/runtime/session_compaction.go: flatten doCompact
The previous shape used a produceSummary helper that took six
parameters and dispatched between the hook-supplied and
LLM-default paths internally. Replaced with:
- summaryFromHook(sess, a, pre) — a tiny free function that
lifts a before_compaction Summary into a compactor.Result
when one was supplied, or returns nil to signal "fall through
to the LLM path".
- The LLM path is now inline in doCompact, so the orchestrator
reads top-to-bottom: lookup contextLimit → before_compaction
veto-check → started event → choose strategy (hook or LLM) →
apply summary → SessionSummary → after_compaction. No more
six-parameter helper, no more nested return paths through a
separate function.
Net effect: doCompact loses its inner-helper indirection but
keeps the same number of error returns, and the
"hook-supplied-vs-LLM-default" decision is now a single nil
check at the doCompact level.
Diff summary across the three files: -118 / +115 lines, with a
proportional reduction in cross-function indirection (one fewer
helper in session_compaction.go, two fewer exported symbols in
compactor).
Assisted-By: docker-agent
Triaged the reviewer findings against the code. One real fix and a
batch of documentation / contract clarifications. None of the alleged
P0/P1 correctness regressions held up:
- SplitIndexForKeep / FirstIndexInBudget allegedly inverted: misread.
The functions return the boundary INDEX (messages[:N] is the compact
window, messages[N:] is the keep window). Returning len(messages)
when everything fits — i.e. "compact everything, keep nothing" — is
the documented and tested contract from origin/main. "Fixing" it to
return 0 would break manual /compact on small sessions (it would
become a no-op).
- session_start started emitted before deny check: misread. The deny
return is on line 65, the started emit on line 69. Asserted by
TestDoCompactBeforeHookDeniesSkipsCompaction.
- Session-mutation race: pre-existing property of the codebase, not
introduced by this PR. All compaction call sites are on the main
RunStream goroutine; if that ever changes, the fix is at the session
layer, not here.
- Token estimation mismatch and silent truncation on tiny budgets:
pre-existing edge cases inherited from origin/main.
Real fix:
- pkg/runtime/{session_compaction,hooks}.go: after_compaction now
receives the *pre-compaction* InputTokens / OutputTokens (what was
summarized) instead of the post-compaction values. This is the
natural shape for an observability hook ("compacted from X to Y");
the post-compaction counts already reach UIs through the next
TokenUsageEvent. Snapshot before mutation, pass into
executeAfterCompactionHooks, plumbed through. The new
TestDoCompactAfterHookFires seeds sess.{Input,Output}Tokens=1234,567
and asserts those exact values reach the hook (a regression to
post-compaction would surface as ~EstimateMessageTokens(summary)|0).
Documentation clarifications (no behavior change):
- pkg/compaction/compaction.go: rewrote SplitIndexForKeep and
FirstIndexInBudget docs to make the boundary-INDEX contract
unambiguous (messages[:N] / messages[N:]) and to spell out why the
"all fits → compact everything" branch is the right answer for
manual /compact, threshold compaction on tiny windows, and overflow
recovery. Renamed the relevant test case so its intent is obvious.
- pkg/hooks/types.go: EventBeforeCompaction overflow-denial warning
now spells out the actual consequence (one retry-with-compaction
guarded by maxOverflowCompactions, then an Error event — not an
infinite loop). EventAfterCompaction docstring documents the
pre-compaction-token-counts contract. Input.ContextLimit and
Input.Summary docstrings explain when each is zero / empty.
- pkg/hooks/executor.go: aggregate's "first summary wins" comment now
spells out "first in CONFIG order, hooks run concurrently but write
to indexed slots" so future readers can't suspect an order race.
- pkg/runtime/compactor/compactor.go: extractMessages clearing of
Cost / CacheControl now has a comment explaining why (avoid
double-counting cost; don't pin the parent's cache checkpoint inside
the throwaway compaction sub-call).
- pkg/runtime/session_compaction.go: summaryFromHook clarifies that
Result.Cost is left at zero because no LLM call ran.
- examples/hooks.yaml: the after_compaction example now shows the
"compacted from X to Y" pattern; added a commented-out custom-summary
example demonstrating the non-LLM replacement path (use cases:
deterministic strategies, regulated environments, deterministic
CI tests).
Assisted-By: docker-agent
f26a89c to
1e9512e
Compare
rumpl
approved these changes
Apr 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does
Adds two new hook events that make session compaction first-class in
the hooks mechanism, plus a refactor that splits the compaction
strategy into its own package so future strategies can land alongside
the default one without bloating
pkg/runtime.New hook events
before_compactionfires immediately before the runtimecompacts a session. The Input carries
input_tokens,output_tokens,context_limit, and acompaction_reasonof\"threshold\"(proactive 90% trigger),\"overflow\"(post-overflow recovery) or
\"manual\"(user-invoked/compact). Hooks can:\"block\"or exit code 2)hookSpecificOutput.summary— the runtime applies that stringverbatim and skips the model call entirely
after_compactionfires after a successful compaction. Receivesthe produced
summarytogether with the pre-compactioninput_tokens/output_tokens(what was summarized) soobservability handlers can naturally express "compacted from X to
Y". Purely observational; output is ignored.
Compactor extraction (
pkg/runtime/compactor)The LLM-summarization strategy and session-aware extraction helpers
move out of
pkg/runtimeinto a new package, leavingpkg/runtime/session_compaction.goas a thin orchestrator (~130 LoC,down from ~200) that does only what's intrinsically runtime-private:
hook dispatch, session mutation, event emission, and persistence.
Pure-logic helpers (
SplitIndexForKeep,FirstIndexInBudget) moveto
pkg/compactionwhere they joinShouldCompactandEstimateMessageTokens. The default LLM strategy(
compactor.RunLLM) becomes a public function that user-suppliedbefore_compactionhooks can call back into if they want to delegateto the default with tweaks (different model, different prompt) instead
of copying the LLM-call boilerplate.
To avoid an import cycle (
pkg/runtime/compactordoesn't importpkg/runtime),RunLLMaccepts aRunAgentcallback that theruntime supplies — the actual sub-runtime construction with
WithSessionCompaction(false)stays in the runtime.Behavior preservation
The mission-critical "no hooks configured" path is bit-for-bit
unchanged. Three regression guards pin this:
TestCompaction(origin/main, end-to-end LLM compaction with afake stream provider) — passes
TestCompactionOverflowDoesNotLoop— passesTestDoCompactNoHooksMatchesPriorBehavior(new) — assertsidentical event sequence (
SessionCompaction(\"started\") → SessionSummary → SessionCompaction(\"completed\")) when nocompaction hooks are configured
Reviewer pass
A full review was performed before merging. The findings worth
calling out:
session_startre-emit on compact — initially I added aSource=\"compact\"re-emit so hooks likeadd_environment_infocould re-inject context. Then upstream
#2536 ("don't
persist session_start hook output as a session message") landed
and made
session_startoutput transient — threaded into everymodel call automatically. The re-emit became redundant and was
dropped along with its test;
session_startnow fires exactlyonce per
RunStreamas before.after_compactiontoken counts — the reviewer pass moved thiscontract to pre-compaction values ("compacted from X to Y")
before the API had any users. Asserted in
TestDoCompactAfterHookFiresby seedingsess.{Input,Output}Tokens=1234,567and verifying those exactnumbers reach the hook.
Files changed
pkg/hooks/types.go,pkg/hooks/executor.go,pkg/hooks/config.gopkg/hooks/hooks_test.go,pkg/hooks/dispatch_test.gopkg/runtime/hooks.gopkg/runtime/session_compaction.go,pkg/runtime/loop.go,pkg/runtime/runtime.gopkg/runtime/compactor/{compactor,compactor_test}.gopkg/compaction/{compaction,compaction_test}.gopkg/config/latest/types.go,agent-schema.jsonexamples/hooks.yaml(with a commented-out custom-summary block)pkg/runtime/session_compaction_test.goCaveats documented in the schema and example
Vetoing on
compaction_reason=\"overflow\"means the next LLM callhits the same overflow. The runtime allows at most one
retry-with-compaction
(
maxOverflowCompactions),so a second denial fails the turn and surfaces an
Errorevent(not an infinite loop). Hook authors should gate denials by reason.
Verification
go build ./...— greengo test ./pkg/... ./cmd/...— all greengo test -race ./pkg/runtime/... ./pkg/compaction/... ./pkg/hooks/... ./pkg/config/...— green with-racegolangci-lint run ./...—0 issues