
fix: preserve tool calls when thinking models return no text content#11866

Merged
RomneyDa merged 2 commits into main from qwen3-agent
Mar 26, 2026
Conversation

Contributor

@RomneyDa RomneyDa commented Mar 26, 2026

Summary

When Qwen3-Coder (or similar thinking models) returns thinking content + tool calls but no text content via Ollama, the early return in convertChatMessage only yielded the thinking message, silently discarding tool calls. This caused the agent spinner to never stop and tools to never execute.

Fix: Add !toolCalls?.length to the early return condition so tool calls are preserved even when thinking is present without text content.
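The fix can be sketched as follows. This is an illustrative reconstruction based on the PR description, not the actual Continue source; the message shape and the `convertChatMessage` signature are assumptions.

```typescript
// Hypothetical shape of a converted Ollama chat message.
interface ChatChunk {
  thinking?: string;
  content?: string;
  toolCalls?: { name: string; arguments: string }[];
}

function* convertChatMessage(msg: ChatChunk) {
  // Before the fix, the condition was `msg.thinking && !msg.content`,
  // so a chunk with thinking + tool calls but no text returned early
  // here and the tool calls were silently dropped.
  if (msg.thinking && !msg.content && !msg.toolCalls?.length) {
    yield { role: "thinking", content: msg.thinking };
    return;
  }
  if (msg.thinking) {
    yield { role: "thinking", content: msg.thinking };
  }
  if (msg.content) {
    yield { role: "assistant", content: msg.content };
  }
  // Tool calls are now always forwarded when present.
  for (const toolCall of msg.toolCalls ?? []) {
    yield { role: "assistant", toolCall };
  }
}
```

With the extra `!toolCalls?.length` guard, the thinking-only fast path still fires for pure reasoning chunks, while chunks that also carry tool calls fall through to the full conversion.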

Note: The vLLM users in #8744 see a server-side Python error (list index out of range) which is not fixable client-side — they likely need --tool-call-parser configured on vLLM.

Fixes #8744 (Ollama users)


Test plan

  • Manual test with Qwen3-Coder via Ollama with tool calling enabled — verify tools execute and agent completes

…thout text

Two bugs caused Qwen3-Coder (and similar thinking models) to silently
drop tool calls:

1. Ollama: When the model produces thinking content and tool calls but
   no text content, the early return in convertChatMessage only yielded
   the thinking message, discarding tool calls entirely.

2. OpenAI/vLLM: fromChatCompletionChunk used an if/else chain where
   content was checked before tool_calls, so any chunk with both fields
   would lose its tool calls.

Fixes #8744
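The second bug described above can be sketched like this. The delta shape follows the OpenAI chat-completions streaming format, but the function body is an illustrative reconstruction from the commit message, not the actual Continue source.

```typescript
// Subset of an OpenAI-style streaming delta.
interface Delta {
  content?: string;
  tool_calls?: { index: number; function?: { name?: string; arguments?: string } }[];
}

function fromChatCompletionChunk(delta: Delta) {
  const out: { content?: string; toolCalls?: Delta["tool_calls"] } = {};
  // Buggy version: `if (delta.content) { ... } else if (delta.tool_calls) { ... }`
  // meant any chunk carrying both fields lost its tool calls.
  // Fixed version: independent checks so both fields survive.
  if (delta.content) {
    out.content = delta.content;
  }
  if (delta.tool_calls?.length) {
    out.toolCalls = delta.tool_calls;
  }
  return out;
}
```

(Per the follow-up commit in this PR, only the Ollama fix was ultimately kept; this sketch just illustrates the if/else ordering hazard.)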
@RomneyDa RomneyDa requested a review from a team as a code owner March 26, 2026 00:31
@RomneyDa RomneyDa requested review from sestinj and removed request for a team March 26, 2026 00:31
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Mar 26, 2026
@continue
Contributor

continue bot commented Mar 26, 2026

Docs Review: No documentation updates needed.

This PR fixes an internal bug in the LLM streaming logic for thinking models (Ollama and OpenAI/vLLM providers). The changes ensure tool calls are preserved when thinking models return thinking content without text content. These are implementation-level fixes that don't affect any user-facing APIs, configuration options, or documented behavior—users will simply see tool calling work correctly in edge cases where it was previously broken.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 2 files

The vLLM "list index out of range" error is server-side (Python), not
fixable by reordering client-side chunk parsing. Keep only the Ollama
fix which addresses the confirmed bug.
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Mar 26, 2026
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 26, 2026
@github-project-automation github-project-automation bot moved this from Todo to In Progress in Issues and PRs Mar 26, 2026
@RomneyDa RomneyDa merged commit 3a997fe into main Mar 26, 2026
119 of 124 checks passed
@RomneyDa RomneyDa deleted the qwen3-agent branch March 26, 2026 04:11
@github-project-automation github-project-automation bot moved this from In Progress to Done in Issues and PRs Mar 26, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 26, 2026

Labels

lgtm This PR has been approved by a maintainer size:XS This PR changes 0-9 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Fail to call Qwen3-Coder model with tool calling enabled

2 participants