Skip to content

Skills: allow fork skills to override the model#2525

Merged
dgageot merged 6 commits intodocker:mainfrom
dgageot:board/skill-model-override-capability-4bed7977
Apr 27, 2026
Merged

Skills: allow fork skills to override the model#2525
dgageot merged 6 commits intodocker:mainfrom
dgageot:board/skill-model-override-capability-4bed7977

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented Apr 27, 2026

Summary

Allow fork skills (context: fork) to override the model used while running as a sub-agent, via a new model: field in SKILL.md frontmatter. Useful when a skill is best handled by a faster, cheaper, or more specialised model than the parent agent's default — e.g. a powerful reasoning model for refactors, or a fast model for routine bookkeeping.

---
name: bump-go-dependencies
description: Update Go module dependencies one by one
context: fork
model: openai/gpt-4o-mini
---

The model value accepts the same forms as the rest of the agent config: a named model from the config, an inline provider/model reference, or a comma-separated alloy. It is ignored for non-fork skills (which don't run as a sub-session).

Behaviour

  • Override is applied for the duration of the sub-session and the agent's previous model is restored when the skill completes.
  • If the model reference cannot be resolved (unknown name, missing credentials, runtime not configured for model switching, …) the skill falls back to the agent's currently-active model and a warning is logged.
  • If the user switches the model via the TUI model picker while a fork skill is running, their choice is preserved — the deferred restore detects the concurrent change and is a no-op.

Implementation notes

The save/restore around the sub-session uses pointer-identity compare-and-swap on the agent's atomic.Pointer[[]provider.Provider]:

  • Agent.SetModelOverride(...) now returns a ModelOverrideSnapshot of the value just stored.
  • LocalRuntime.WithAgentModel(ctx, name, ref) (restore func(), err error) is the canonical entry point: it captures prev, applies the override, and returns a restore closure that CAS-swaps prev back only if the override has not been mutated since.
  • The post-apply snapshot is captured atomically with the store (returned by SetModelOverride), so there is no race window where a concurrent SetModelOverride could be misattributed to the skill's scope.
  • The returned restore func is always non-nil — on error it is a no-op closure — so callers can defer restore() unconditionally.

Files

  • pkg/skills/skills.goSkill.Model field + frontmatter parsing
  • pkg/agent/agent.goModelOverrideSnapshot, SnapshotModelOverride, RestoreModelOverride; SetModelOverride returns the snapshot of the stored value
  • pkg/runtime/model_switcher.goWithAgentModel helper + setAgentModelInternal threading the snapshot through every branch
  • pkg/runtime/skill_runner.go — apply/restore around the sub-session
  • docs/features/skills/index.md — new model field documented + example

Tests

Race-detector unit tests cover:

  • frontmatter parsing for both named-model and provider/model forms
  • Agent.SetModelOverride returning a snapshot of the stored pointer (not load-after-store)
  • Snapshot/RestoreModelOverride happy path, restore-to-pre-existing-override, concurrent-change-preserved, concurrent-clear-preserved
  • WithAgentModel for: agent-not-found, nil model switcher, invalid ref, apply+restore, idempotent restore, concurrent change preserved

Validated with go build, go vet, golangci-lint (0 issues), and the full mise test suite under -race.

Commits

The branch is structured so every commit is independently reviewable:

  1. feat(skills): allow fork skills to override the model — adds the field and the simplest possible apply/restore
  2. refactor(runtime): reuse SetAgentModel for skill model overrides — drops a duplicated resolution helper
  3. fix(skills): preserve concurrent model picks via CAS restore — introduces the snapshot/CAS-restore primitive
  4. refactor(runtime): introduce WithAgentModel, drop dead ModelOverrides — single-call helper, removes dead API
  5. fix(runtime): always-non-nil restore from WithAgentModel + add tests — caller-side ergonomics + coverage
  6. fix(runtime): close race window in WithAgentModel snapshot capture — atomically captures the post-apply snapshot inside SetModelOverride

Assisted-By: docker-agent

dgageot added 6 commits April 26, 2026 09:38
Fork skills (context: fork) previously inherited the parent agent's
model with no way to switch. Add a model field to SKILL.md frontmatter
that overrides the model used while the sub-session runs and is
restored when the skill completes.

- Parse `model:` from SKILL.md frontmatter into Skill.Model.
- Add Agent.ModelOverrides() returning a defensive copy so callers
  can save and restore the active override around a sub-session.
- In handleRunSkill, resolve the skill's model ref (named, inline
  provider/model, or comma-separated alloy) via the runtime's model
  switcher, apply it as the agent's override, and defer restoration.
  Falls back to the parent's model with a warning if resolution fails
  or the runtime has no model switcher configured.
- Document the new field and add an example in the skills guide.
- Cover parsing and the not-configured runtime path with tests, and
  extend TestModelOverride for the new ModelOverrides() getter and
  save/restore round-trip.

Assisted-By: docker-agent
The applySkillModelOverride helper duplicated a less complete subset
of the resolution logic already implemented by SetAgentModel (named
model, alloy from config, inline provider/model, inline alloy spec).

Replace the 40-line helper with a 10-line save/restore around a direct
SetAgentModel call: capture the previous override via ModelOverrides(),
apply the new one, and on success defer SetModelOverride(prev...).
SetModelOverride filters nil providers and clears the override when
the resulting slice is empty, so the same call handles both "restore
previous override" and "restore default" without an explicit branch.

This drops the now-unused agent / provider / errors / strings imports
and the standalone helper test (the no-switcher error path is already
covered by SetAgentModel itself).

Assisted-By: docker-agent
The previous save/restore around a fork-skill sub-session was racy
against the TUI model picker: a user switching model mid-skill had
their choice silently reverted by the deferred restore.

Replace the naive ModelOverrides() snapshot + SetModelOverride(prev...)
restore with a pointer-identity compare-and-swap:

- Add an opaque ModelOverrideSnapshot type plus SnapshotModelOverride
  and RestoreModelOverride methods on Agent. RestoreModelOverride
  CAS-swaps on the underlying atomic.Pointer, so the restore is a
  no-op if any other caller mutated the override since it was
  captured.
- handleRunSkill now snapshots before and after applying the skill's
  override and defers a CAS-restore. If the user switches model via
  the TUI (or any other caller wins the race), their choice is kept
  and the skill's override scope ends cleanly.
- Document that ModelOverrides() is for read-only inspection only and
  that the snapshot/restore primitive must be used for scoped
  overrides.
- Clarify the docs: fallback is to the agent's currently-active
  model (configured default OR a previously-set override), and
  concurrent TUI model switches during a fork skill are preserved.

Validated with race-detector tests covering the happy path, restore
to a pre-existing override, and both concurrent-change and
concurrent-clear scenarios.

Assisted-By: docker-agent
Two follow-up simplifications now that the skill model-override feature
is in place.

- Encapsulate the snapshot+SetAgentModel+snapshot+restore-closure idiom
  in a new LocalRuntime.WithAgentModel(ctx, name, ref) (restore, err)
  helper. The skill runner shrinks from a dozen lines (with two manual
  SnapshotModelOverride calls and a hand-written defer) to a 6-line
  if/else: get a restore closure or log and skip. The CAS semantics
  live in one place.

- Drop the now-unused Agent.ModelOverrides() getter. It was added for
  the original naive save/restore in handleRunSkill, but the CAS
  refactor moved off slice-copy save/restore. The only remaining
  caller was the unit test, which now asserts via Model().ID() and
  HasModelOverride() — same coverage with less ceremony. Snapshot /
  restore is still tested by TestSnapshotAndRestoreModelOverride.

No feature change. Validated with build, vet, golangci-lint, and the
race-detector test suite for pkg/agent, pkg/runtime, pkg/skills,
pkg/tools/builtin, pkg/tools/builtin/agent.

Assisted-By: docker-agent
Address two reviewer findings on the WithAgentModel helper.

- Make the returned restore closure always non-nil. On error it is a
  no-op closure; on success it performs the CAS-restore. Callers can
  now defer restore() unconditionally without nil-checking, removing
  a footgun for future callers. The skill runner is simplified
  accordingly: a single defer restore() right after the call instead
  of an if/else around it.

- Add dedicated unit tests for WithAgentModel covering:
    * agent not found        — error + safe no-op restore
    * nil modelSwitcherCfg   — error + safe no-op restore + agent untouched
    * invalid model ref      — error + safe no-op restore + agent untouched
    * apply clears existing override; restore puts it back
    * restore is idempotent  — second call is a CAS no-op
    * concurrent change is preserved by restore (TUI scenario at the
      runtime layer)

The "apply" tests deliberately use the empty-string model ref form,
which short-circuits inside SetAgentModel to a.SetModelOverride() and
therefore needs no provider resolution. This keeps the tests hermetic
while still exercising the snapshot+apply+CAS-restore composition.

Validated with go build, go vet, golangci-lint (0 offenses), and the
full pkg/agent + pkg/runtime + pkg/skills + pkg/tools/builtin test
suites under -race.

Assisted-By: docker-agent
Reviewer flagged a real (if narrow) race in WithAgentModel:

    prev := a.SnapshotModelOverride()
    if err := r.SetAgentModel(ctx, name, ref); err != nil { ... }
    ours := a.SnapshotModelOverride()  // <-- race window here

Between SetAgentModel returning and the post-apply snapshot being
captured, a concurrent caller (TUI model picker, etc.) could store its
own override. `ours` would then refer to that caller's pointer instead
of the one we just stored, and the deferred CAS-restore would
incorrectly succeed and clobber the user's choice.

Fix the race by capturing the snapshot atomically with the store
itself:

- SetModelOverride now returns ModelOverrideSnapshot — the pointer it
  just stored. Existing callers that don't care about the snapshot
  ignore the return value (Go allows this), so no source changes are
  needed beyond the agent itself.
- Refactor SetAgentModel into a thin wrapper over a new
  setAgentModelInternal that returns (snapshot, error). Each branch
  now threads the snapshot from a.SetModelOverride(...) through the
  return.
- WithAgentModel no longer calls SnapshotModelOverride after the
  apply: it uses the snapshot returned by setAgentModelInternal,
  which holds exactly the pointer the agent stored, regardless of
  any concurrent SetModelOverride call. The CAS-restore now correctly
  fails (preserving the concurrent change) instead of incorrectly
  succeeding.

Tests:
- TestSetModelOverride_ReturnsSnapshotOfStoredValue proves the new
  property: a concurrent SetModelOverride between our store and a
  later restore is preserved, because oursSnap holds the pointer we
  stored, not the load-after-store value.
- TestSetModelOverride_ClearReturnsZeroSnapshot verifies the
  clear-and-restore round-trip via the returned snapshot.

Validated with go build, go vet, golangci-lint (0 issues), and the
race-detector test suite for pkg/agent, pkg/runtime, pkg/skills,
pkg/tools/builtin, pkg/tools/builtin/agent, and the full mise test
suite.

Assisted-By: docker-agent
@dgageot dgageot requested a review from a team as a code owner April 27, 2026 11:22
@dgageot dgageot merged commit 502a990 into docker:main Apr 27, 2026
9 checks passed
dgageot added a commit that referenced this pull request Apr 27, 2026
The merge of #2524 (move fork-skill validation into SkillsToolset) on
top of #2525 (skill model-override) left handleRunSkill referencing
identifiers that no longer exist in its scope: skill.Model and
params.Name. The refactor renamed params -> args and dropped the
local skill variable in favour of a PreparedSkillFork bundle, but
the model-override block introduced by #2525 was not updated, so
package github.com/docker/docker-agent/pkg/runtime failed to build.

Restore the build by surfacing the model override through the
toolset, in line with the refactor's intent of keeping skill-specific
business rules out of the runtime.

- Add Model to PreparedSkillFork and populate it from skill.Model
  inside (*SkillsToolset).PrepareForkSubSession, so the runtime no
  longer needs a direct skills.Skill reference.
- Switch handleRunSkill's override block to prepared.Model and
  prepared.SkillName; the WithAgentModel call, defer restore(), and
  warning log are otherwise unchanged.
- Extend the PrepareForkSubSession happy-path test to assert the
  Model field is propagated, and add a _NoModelOverride case that
  pins down the empty-Model contract the runtime relies on to skip
  the override path.

Validated with go build, go vet, golangci-lint (0 issues), and
go test on pkg/runtime, pkg/tools/builtin, pkg/skills, and pkg/agent.

Assisted-By: docker-agent
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants