Short sequences of numbers can cause extremely long repetitive inference #412

@RndyP

Platform: Windows C++ app built with VS2022. My PC is a Dell laptop with a quad-core i5.

Pass a 3-second audio clip of the word "six" spoken three or four times, and the call can take up to a minute of CPU time to return and sometimes includes odd gibberish.
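For anyone trying to reproduce the timings, here is a minimal sketch of how the per-chunk duration could be measured. `time_call_seconds` is a hypothetical helper, not part of whisper.cpp, and the callable passed to it stands in for the real inference call. It measures wall-clock time with `std::chrono::steady_clock` rather than CPU time, since `std::clock` does not report CPU time on every platform.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of whisper.cpp): runs `work` once and
// returns the elapsed wall-clock time in seconds.
// `work` is a stand-in for the real inference call on one audio chunk.
inline double time_call_seconds(const std::function<void()> & work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

In the app this would wrap the call that processes one chunk, for example `time_call_seconds([&] { /* run inference on the chunk */ });`, and the result could be logged next to the transcribed text.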

Here is an example. I am speaking "six,six,six" as clearly as I can, and am sending the audio buffer to Whisper. The lines labeled "erase" are simply silence in my audio buffer, and are not sent to Whisper. The lines with the timings in seconds are Whisper processing approximately 3-second chunks of "six,six,six":
image
As you can see, there are 2 correct inferences there, at 11 and 17 seconds. The others take quite a bit of time, and one has a bit of gibberish at the end. I have also seen longer strings of gibberish and longer times. Here's an 80-second CPU grind:
image
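For context on the "erase" lines above, this is roughly what a silence gate of that kind can look like. It is only a sketch under assumptions: the names `rms_level` and `is_silence` and the `0.01` threshold are illustrative, and the samples are assumed to be mono float PCM in the range [-1, 1].

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square level of a mono float PCM buffer.
// An empty buffer is treated as level 0.
inline float rms_level(const std::vector<float> & samples) {
    if (samples.empty()) {
        return 0.0f;
    }
    double sum_sq = 0.0;
    for (float s : samples) {
        sum_sq += static_cast<double>(s) * s;
    }
    return static_cast<float>(std::sqrt(sum_sq / samples.size()));
}

// True when the chunk is quiet enough to skip instead of sending it on.
// The default threshold is an assumed value and would need tuning per microphone.
inline bool is_silence(const std::vector<float> & samples, float threshold = 0.01f) {
    return rms_level(samples) < threshold;
}
```

A chunk for which `is_silence` returns true would be discarded, and only the remaining chunks would be passed to Whisper.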

Here are my init params:

// get default Whisper parameters
m_params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

// overrides
m_params.print_progress   = false;
m_params.print_timestamps = false;
m_params.no_context       = true;
m_params.single_segment   = true;
m_params.max_tokens       = 0;     // 0 = no limit

const char BinFilename[] = "ggml-tiny.en.bin";
m_ctx = whisper_init_from_file(BinFilename);

Labels: enhancement