Description
Name and Version
version: 7063 (38eaf32)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
Operating systems
Linux, Windows
Which llama.cpp modules do you know to be affected?
Other (Please specify in the next section)
Command line
export CMAKE_GENERATOR=Ninja
export VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation
export VK_LOADER_DEBUG=all
export GGML_VK_DEBUG=1
export GGML_VK_VERBOSE=1
rm -rf build
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release
./build/bin/llama-speculative-simple \
--model models2/Qwen3-4B-Instruct-2507-Q4_K_M.gguf \
--model-draft models2/Qwen3-0.6B-Q4_K_M.gguf \
--prompt "Provide a long explanation about what quantum computing is, in simple terms and with examples"
Problem description & steps to reproduce
We first noticed this in our own implementation of speculative decoding, written in Rust with llama-cpp-2, but it is reproducible with the llama.cpp example above, which shows the same behavior.
When the example reaches the decoding loop and passes the batch of tokens to the target model with llama_decode(ctx_tgt, batch_tgt);, the process freezes on Linux. On Windows it does not hang but exits with code 0xc0000409 (STATUS_STACK_BUFFER_OVERRUN).
From the logs, I gather that the command buffer is in an inconsistent state.
So far this has only happened with the Vulkan backend on NVIDIA GPUs. The same run completes successfully on CPU only, with the CUDA backend, and with Vulkan on integrated Intel GPUs.
First Bad Commit
Relevant log output
Validation Error: [ VUID-vkEndCommandBuffer-commandBuffer-00059 ] | MessageID = 0xdf9fb6be
vkEndCommandBuffer(): was called in VkCommandBuffer 0x57f0aaa09bc0 which is invalid because the bound VkDeviceMemory 0x140000000014 was destroyed.
The Vulkan spec states: commandBuffer must be in the recording state (https://vulkan.lunarg.com/doc/view/1.4.313.0/linux/antora/spec/latest/chapters/cmdbuffers.html#VUID-vkEndCommandBuffer-commandBuffer-00059)
Objects: 3
[0] VkDeviceMemory 0x140000000014
[1] VkBuffer 0x130000000013
[2] VkCommandBuffer 0x57f0aaa09bc0
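
To make the validation message easier to read, here is a small sketch of the command buffer lifecycle rule it refers to. This is a toy model only: it makes no real Vulkan calls and is not the ggml-vulkan code. The type ToyCommandBuffer and the function end_after_destroy are hypothetical names invented for illustration. The ordering it replays (memory destroyed while the command buffer is still recording) is my assumption based on the log, not something I have confirmed in the backend.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Toy model of the Vulkan command buffer lifecycle (simplified, no real Vulkan calls).
enum class CmdState { Initial, Recording, Executable, Invalid };

struct ToyCommandBuffer {
    CmdState state = CmdState::Initial;
    // Handles of memory objects referenced by commands recorded so far.
    std::unordered_set<std::uint64_t> bound_memory;

    // Simplification: begin() always restarts recording from a clean slate.
    void begin() {
        bound_memory.clear();
        state = CmdState::Recording;
    }

    // Record a command that references a memory allocation.
    void record_use(std::uint64_t memory) {
        assert(state == CmdState::Recording);
        bound_memory.insert(memory);
    }

    // Destroying a resource used by a recorded command invalidates the command buffer.
    void on_memory_destroyed(std::uint64_t memory) {
        if (bound_memory.count(memory) != 0) {
            state = CmdState::Invalid;
        }
    }

    // Mirrors VUID-vkEndCommandBuffer-commandBuffer-00059:
    // ending is only legal while the command buffer is in the recording state.
    bool end() {
        if (state != CmdState::Recording) {
            return false;
        }
        state = CmdState::Executable;
        return true;
    }
};

// Replays the ordering suggested by the log (an assumption, see above):
// begin, record a command using the memory, destroy the memory, then end.
bool end_after_destroy() {
    ToyCommandBuffer cmd;
    cmd.begin();
    cmd.record_use(0x140000000014ULL);           // VkDeviceMemory handle from the log
    cmd.on_memory_destroyed(0x140000000014ULL);  // moves the buffer to Invalid
    return cmd.end();                            // rejected: no longer recording
}
```

In this model end_after_destroy() returns false, which corresponds to the validation layer rejecting vkEndCommandBuffer because the command buffer left the recording state when its bound memory was destroyed. If the memory that gets destroyed is not referenced by the command buffer, end() succeeds as usual.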