Misc. bug: llama-speculative-simple not working on Nvidia with Vulkan after b7063 #17957

@nikomartn

Name and Version

version: 7063 (38eaf32)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu

Operating systems

Linux, Windows

Which llama.cpp modules do you know to be affected?

Other (Please specify in the next section)

Command line

export CMAKE_GENERATOR=Ninja
export VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation
export VK_LOADER_DEBUG=all
export GGML_VK_DEBUG=1
export GGML_VK_VERBOSE=1

rm -rf build
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

./build/bin/llama-speculative-simple \
    --model models2/Qwen3-4B-Instruct-2507-Q4_K_M.gguf \
    --model-draft models2/Qwen3-0.6B-Q4_K_M.gguf \
    --prompt "Provide a long explanation about what quantum computing is, in simple terms and with examples"

Problem description & steps to reproduce

We first noticed this in our own implementation of speculative decoding, written in Rust with llama-cpp-2, but it is reproducible with the llama-speculative-simple example, which shows the same behavior.

When the example reaches the loop and passes the tokens to the target model as a batch, via llama_decode(ctx_tgt, batch_tgt);, the process freezes on Linux. On Windows it does not hang but exits with code 0xc0000409 (STATUS_STACK_BUFFER_OVERRUN).
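For readers unfamiliar with the step that hangs, here is a minimal, self-contained sketch of a greedy draft-then-verify loop. It is a toy illustration only, not the code of llama-speculative-simple: `Token`, `NextTokenFn`, `draft_tokens`, `verify`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are plain functions rather than llama.cpp contexts. In the real example the target model checks all drafted tokens in a single batched decode call; here that is written as an ordinary loop.

```cpp
#include <cstddef>
#include <vector>

using Token = int;

// Hypothetical stand-in for a model: maps the last token to the next token.
// A real implementation would call into llama.cpp instead.
using NextTokenFn = Token (*)(Token);

// Greedily draft n_draft tokens with the small (draft) model.
std::vector<Token> draft_tokens(NextTokenFn draft, Token last, std::size_t n_draft) {
    std::vector<Token> out;
    for (std::size_t i = 0; i < n_draft; ++i) {
        last = draft(last);
        out.push_back(last);
    }
    return out;
}

// Check the drafted tokens against the target model.
// Returns the tokens to keep: the prefix on which both models agree, followed
// by the target's own token at the first mismatch, or by one extra target
// token if every drafted token was accepted.
std::vector<Token> verify(NextTokenFn target, Token last, const std::vector<Token>& drafted) {
    std::vector<Token> accepted;
    for (Token d : drafted) {
        const Token t = target(last);
        accepted.push_back(t);
        if (t != d) {
            return accepted;  // first disagreement: keep the target's token and stop
        }
        last = t;
    }
    accepted.push_back(target(last));  // all drafts accepted: one extra token
    return accepted;
}

// Toy models: the target counts upward modulo 10; the draft agrees with it
// except after token 3, where it wrongly jumps to 7.
Token toy_target(Token t) { return (t + 1) % 10; }
Token toy_draft(Token t) { return t == 3 ? 7 : (t + 1) % 10; }
```

Starting from token 0 with five drafted tokens, the toy draft proposes 1, 2, 3, 7, 8 and the verification step keeps 1, 2, 3, 4. The hang described above occurs at the point that corresponds to the verification step, when the target model evaluates the batch.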

From the logs, I gather that the command buffer is in an inconsistent state.

This has only happened with the Vulkan backend on Nvidia GPUs. The example runs successfully on the CPU alone, with the CUDA backend, and with Vulkan on integrated Intel GPUs.

First Bad Commit

38eaf32
b7062...b7063

Relevant log output

Validation Error: [ VUID-vkEndCommandBuffer-commandBuffer-00059 ] | MessageID = 0xdf9fb6be
vkEndCommandBuffer(): was called in VkCommandBuffer 0x57f0aaa09bc0 which is invalid because the bound VkDeviceMemory 0x140000000014 was destroyed.
The Vulkan spec states: commandBuffer must be in the recording state (https://vulkan.lunarg.com/doc/view/1.4.313.0/linux/antora/spec/latest/chapters/cmdbuffers.html#VUID-vkEndCommandBuffer-commandBuffer-00059)
Objects: 3
    [0] VkDeviceMemory 0x140000000014
    [1] VkBuffer 0x130000000013
    [2] VkCommandBuffer 0x57f0aaa09bc0

Labels

Vulkan (Issues specific to the Vulkan backend), bug (Something isn't working)
