fix(realtime): Better support for thinking models and setting model parameters#8595

Merged
mudler merged 4 commits into mudler:master from richiejp:fix/realtime-functions2
Feb 18, 2026
Conversation

@richiejp
Collaborator

  • fix(realtime): Wrap functions in OpenAI chat completions format
  • feat(realtime): Set max tokens from session object
  • fix(realtime): Find thinking start tag for thinking extraction
  • fix(realtime): Don't send buffer cleared message when we automatically drop it

Description

Various fixes for realtime mode, made while testing with a thinking model that uses llama.cpp's embedded tokenizer template.

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Richard Palethorpe <io@richiejp.com>
@netlify

netlify bot commented Feb 18, 2026

Deploy Preview for localai ready!

🔨 Latest commit fd2b0e5
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/69958085c8092f0008a947ab
😎 Deploy Preview https://deploy-preview-8595--localai.netlify.app

@mudler mudler enabled auto-merge (squash) February 18, 2026 09:32
@mudler mudler disabled auto-merge February 18, 2026 13:36
@mudler mudler added the bug Something isn't working label Feb 18, 2026
@mudler mudler merged commit 86b3bc9 into mudler:master Feb 18, 2026
38 checks passed
localai-bot pushed a commit to localai-bot/LocalAI that referenced this pull request Mar 25, 2026
…arameters (mudler#8595)

* fix(realtime): Wrap functions in OpenAI chat completions format

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(realtime): Set max tokens from session object

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Find thinking start tag for thinking extraction

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Don't send buffer cleared message when we automatically drop it

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>

Labels

bug Something isn't working

2 participants