feat: detect context window limit errors for auto-compact #7

jhult · 2026-01-13T21:58:52Z

Add ContextLimitError class to detect when upstream API returns context window limit errors. The error response includes a suggest_compact flag that clients can use to trigger their own conversation compaction.

When a context window limit error is received from the upstream API, the proxy now automatically compacts and retries for non-streaming requests:

Generates a structured summary with 5 sections: Task Overview, Current State, Important Discoveries, Next Steps, Context to Preserve
Places summary as a single assistant message (replacing entire message history)
Increases max_tokens to 2000 for detailed summaries
Requests <summary></summary> tag wrapping
Retries the original request with the compacted conversation

This mimics client-side /compact command behavior, allowing conversations to continue seamlessly when hitting token limits.

Auto-compact retry only applies to non-streaming requests since streaming responses cannot be retried mid-stream.

Relates to dejay2#5

feat: detect context window limit errors for auto-compact

e141462

Relates to dejay2#5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: detect context window limit errors for auto-compact #7

feat: detect context window limit errors for auto-compact #7

jhult commented Jan 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

feat: detect context window limit errors for auto-compact #7

Are you sure you want to change the base?

feat: detect context window limit errors for auto-compact #7

Conversation

jhult commented Jan 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant