Conversation

@simonferquel
Contributor

Summary

Refactors session message handling to improve LLM prompt caching by categorizing system messages based on caching characteristics.

Changes

  • Split system messages into three categories: invariant (cacheable globally), context-specific (cacheable per user/project), and session summaries
  • Added cache control markers at each category boundary, so each cacheable prefix can be reused independently
  • New tests for cache control behavior
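The categorization above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the type and field names (`Category`, `CacheControl`, `markCacheBoundaries`) are hypothetical, and the marker semantics assume an Anthropic-style scheme where a marker on a message lets the provider reuse the entire prompt prefix up to and including that message.

```go
package main

import "fmt"

// Category orders system messages from most to least cacheable.
// Hypothetical names; the PR does not show its actual types.
type Category int

const (
	Invariant       Category = iota // identical for every session → cacheable globally
	ContextSpecific                 // stable per user/project → cacheable per context
	SessionSummary                  // changes every session → not worth a marker
)

type Message struct {
	Category     Category
	Content      string
	CacheControl bool // marks the end of a cacheable prefix
}

// markCacheBoundaries sets a cache marker on the last message of each
// cacheable category, so the prompt prefix up to that point can be reused.
func markCacheBoundaries(msgs []Message) []Message {
	for i := range msgs {
		lastOfCategory := i == len(msgs)-1 || msgs[i+1].Category != msgs[i].Category
		if lastOfCategory && msgs[i].Category != SessionSummary {
			msgs[i].CacheControl = true
		}
	}
	return msgs
}

func main() {
	msgs := markCacheBoundaries([]Message{
		{Category: Invariant, Content: "base system prompt"},
		{Category: ContextSpecific, Content: "project context"},
		{Category: SessionSummary, Content: "previous session summary"},
	})
	for _, m := range msgs {
		fmt.Println(m.Category, m.CacheControl)
	}
}
```

With markers at both boundaries, a request that shares only the invariant prefix still gets a partial cache hit, while a request from the same user/project reuses the longer prefix.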

Benefits

  • Better cache hit rates → lower API costs and faster responses
  • Fully backward compatible

Technical

Replaced the monolithic GetMessages() with three focused functions, one per message category, so callers can place cache markers between categories and reuse cached prompt prefixes at multiple granularities.
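A rough sketch of that split, under stated assumptions: the accessor names and the `Session` layout below are hypothetical (the PR only names the removed `GetMessages()`), but the shape shows why per-category accessors help, since concatenating them in a fixed order keeps the old behavior while exposing the boundaries.

```go
package main

import "fmt"

// Hypothetical types; the PR does not show its actual signatures.
type Message struct{ Role, Content string }

type Session struct {
	invariant []Message // identical across all sessions
	context   []Message // stable per user/project
	summaries []Message // per-session summaries
}

// One accessor per category replaces the old monolithic GetMessages(),
// letting callers insert a cache marker after each cacheable group.
func (s *Session) InvariantMessages() []Message { return s.invariant }
func (s *Session) ContextMessages() []Message   { return s.context }
func (s *Session) SummaryMessages() []Message   { return s.summaries }

// AllMessages reproduces the old behavior: concatenate the three
// categories in cacheability order (most stable first).
func (s *Session) AllMessages() []Message {
	all := make([]Message, 0, len(s.invariant)+len(s.context)+len(s.summaries))
	all = append(all, s.InvariantMessages()...)
	all = append(all, s.ContextMessages()...)
	return append(all, s.SummaryMessages()...)
}

func main() {
	s := &Session{
		invariant: []Message{{"system", "base prompt"}},
		context:   []Message{{"system", "project notes"}},
		summaries: []Message{{"system", "last session summary"}},
	}
	fmt.Println(len(s.AllMessages())) // → 3
}
```

Keeping the concatenation order fixed (most stable messages first) is what makes the split backward compatible while still exposing the category boundaries to the caching layer.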

@simonferquel simonferquel requested a review from a team as a code owner January 16, 2026 15:34
@simonferquel
Contributor Author

/review

@dgageot dgageot merged commit ef02c5c into docker:main Jan 16, 2026
5 checks passed
