Misc. bug: llama-cli --completion-bash wont work without a model as argument #17973

@teto

Name and Version

./result/bin/llama-cli --version
version: 7342 (2fbe3b7)
built with GNU 14.3.0 for Linux x86_64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

./result/bin/llama-cli --completion-bash
error: --model is required


Passing a model, on the other hand, works:

./result/bin/llama-cli -m /home/teto/llama-models/Ministral-3-14B-Base-2512.Q6_K.gguf --completion-bash

This generates the bash completion script.

Problem description & steps to reproduce

I noticed that llama-server has --completion-bash, so I tried to add completion support to my Linux distribution's package. Then I thought other executables might support it as well.

So I tried adding it for llama-cli. But then I noticed that the generated autocomplete script seemed to be shareable among programs via:

complete -F _llama_completions llama-batched
complete -F _llama_completions llama-batched-bench
complete -F _llama_completions llama-bench
complete -F _llama_completions llama-cli
complete -F _llama_completions llama-convert-llama2c-to-ggml
complete -F _llama_completions llama-cvector-generator
complete -F _llama_completions llama-embedding
complete -F _llama_completions llama-eval-callback
complete -F _llama_completions llama-export-lora
complete -F _llama_completions llama-gen-docs
complete -F _llama_completions llama-gguf
....
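
If one completion function really is acceptable for every binary, the registrations above could be generated with a loop instead of being listed one by one. This is a minimal sketch under that assumption: the `_llama_completions` body below is an empty stand-in (in practice it would come from sourcing the generated script), and the list of tools is a short illustrative subset of the names above.

```shell
# Stand-in for the function defined by the generated completion script.
# Assumption: in real use you would source that script instead of defining this stub.
_llama_completions() {
    COMPREPLY=()
}

# Illustrative subset of the binaries listed above; extend as needed.
for tool in llama-batched llama-bench llama-cli; do
    # Register the same completion function for each binary (bash builtin).
    complete -F _llama_completions "$tool"
done
```

Running `complete -p llama-cli` afterwards prints the registration for that binary, which is a quick way to confirm that the loop took effect.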

To make sure, I ran a diff between the output of llama-cli --completion-bash and llama-server --completion-bash, and the two option lists do differ:

➜ diff llama-cli.completion.bash ./result/share/bash-completion/completions/llama-server.bash 
7c7
<     opts="-h --help --usage --version -cl --cache-list --completion-bash --verbose-prompt -t --threads -tb --threads-batch -C --cpu-mask -Cr --cpu-range --cpu-strict --prio --poll -Cb --cpu-mask-batch -Crb --cpu-range-batch --cpu-strict-batch --prio-batch --poll-batch -c --ctx-size -n --predict --n-predict -b --batch-size -ub --ubatch-size --keep --swa-full --kv-unified -kvu -fa --flash-attn -p --prompt --no-perf -f --file -bf --binary-file -e --escape --no-escape --rope-scaling --rope-scale --rope-freq-base --rope-freq-scale --yarn-orig-ctx --yarn-ext-factor --yarn-attn-factor --yarn-beta-slow --yarn-beta-fast -nkvo --no-kv-offload -nr --no-repack --no-host -ctk --cache-type-k -ctv --cache-type-v -dt --defrag-thold -np --parallel --mlock --no-mmap --numa -dev --device --list-devices --override-tensor -ot --cpu-moe -cmoe --n-cpu-moe -ncmoe -ngl --gpu-layers --n-gpu-layers -sm --split-mode -ts --tensor-split -mg --main-gpu --check-tensors --override-kv --no-op-offload --lora --lora-scaled --control-vector --control-vector-scaled --control-vector-layer-range -m --model -mu --model-url -dr --docker-repo -hf -hfr --hf-repo -hfd -hfrd --hf-repo-draft -hff --hf-file -hfv -hfrv --hf-repo-v -hffv --hf-file-v -hft --hf-token --log-disable --log-file --log-colors -v --verbose --log-verbose --offline -lv --verbosity --log-verbosity --log-prefix --log-timestamps -ctkd --cache-type-k-draft -ctvd --cache-type-v-draft --samplers -s --seed --sampling-seq --sampler-seq --ignore-eos --temp --top-k --top-p --min-p --top-nsigma --xtc-probability --xtc-threshold --typical --repeat-last-n --repeat-penalty --presence-penalty --frequency-penalty --dry-multiplier --dry-base --dry-allowed-length --dry-penalty-last-n --dry-sequence-breaker --dynatemp-range --dynatemp-exp --mirostat --mirostat-lr --mirostat-ent -l --logit-bias --grammar --grammar-file -j --json-schema -jf --json-schema-file --no-display-prompt -co --color --no-context-shift --context-shift -sys --system-prompt -sysf 
--system-prompt-file -ptc --print-token-count --prompt-cache --prompt-cache-all --prompt-cache-ro -r --reverse-prompt -sp --special -cnv --conversation -no-cnv --no-conversation -st --single-turn -i --interactive -if --interactive-first -mli --multiline-input --in-prefix-bos --in-prefix --in-suffix --no-warmup -gan --grp-attn-n -gaw --grp-attn-w --jinja --no-jinja --reasoning-format --reasoning-budget --chat-template --chat-template-file --simple-io "
---
>     opts="-h --help --usage --version -cl --cache-list --completion-bash --verbose-prompt -t --threads -tb --threads-batch -C --cpu-mask -Cr --cpu-range --cpu-strict --prio --poll -Cb --cpu-mask-batch -Crb --cpu-range-batch --cpu-strict-batch --prio-batch --poll-batch -c --ctx-size -n --predict --n-predict -b --batch-size -ub --ubatch-size --keep --swa-full --kv-unified -kvu -fa --flash-attn --no-perf -e --escape --no-escape --rope-scaling --rope-scale --rope-freq-base --rope-freq-scale --yarn-orig-ctx --yarn-ext-factor --yarn-attn-factor --yarn-beta-slow --yarn-beta-fast -nkvo --no-kv-offload -nr --no-repack --no-host -ctk --cache-type-k -ctv --cache-type-v -dt --defrag-thold -np --parallel --mlock --no-mmap --numa -dev --device --list-devices --override-tensor -ot --cpu-moe -cmoe --n-cpu-moe -ncmoe -ngl --gpu-layers --n-gpu-layers -sm --split-mode -ts --tensor-split -mg --main-gpu --check-tensors --override-kv --no-op-offload --lora --lora-scaled --control-vector --control-vector-scaled --control-vector-layer-range -m --model -mu --model-url -dr --docker-repo -hf -hfr --hf-repo -hfd -hfrd --hf-repo-draft -hff --hf-file -hfv -hfrv --hf-repo-v -hffv --hf-file-v -hft --hf-token --log-disable --log-file --log-colors -v --verbose --log-verbose --offline -lv --verbosity --log-verbosity --log-prefix --log-timestamps -ctkd --cache-type-k-draft -ctvd --cache-type-v-draft --samplers -s --seed --sampling-seq --sampler-seq --ignore-eos --temp --top-k --top-p --min-p --top-nsigma --xtc-probability --xtc-threshold --typical --repeat-last-n --repeat-penalty --presence-penalty --frequency-penalty --dry-multiplier --dry-base --dry-allowed-length --dry-penalty-last-n --dry-sequence-breaker --dynatemp-range --dynatemp-exp --mirostat --mirostat-lr --mirostat-ent -l --logit-bias --grammar --grammar-file -j --json-schema -jf --json-schema-file --ctx-checkpoints --swa-checkpoints --cache-ram -cram --no-context-shift --context-shift -r --reverse-prompt -sp --special --no-warmup 
--spm-infill --pooling -cb --cont-batching -nocb --no-cont-batching --mmproj --mmproj-url --no-mmproj --no-mmproj-offload --image-min-tokens --image-max-tokens --override-tensor-draft -otd --cpu-moe-draft -cmoed --n-cpu-moe-draft -ncmoed -a --alias --host --port --path --api-prefix --no-webui --embedding --embeddings --reranking --rerank --api-key --api-key-file --ssl-key-file --ssl-cert-file --chat-template-kwargs -to --timeout --threads-http --cache-reuse --metrics --props --slots --no-slots --slot-save-path --media-path --models-dir --models-max --no-models-autoload --jinja --no-jinja --reasoning-format --reasoning-budget --chat-template --chat-template-file --no-prefill-assistant -sps --slot-prompt-similarity --lora-init-without-apply -td --threads-draft -tbd --threads-batch-draft --draft-max --draft --draft-n --draft-min --draft-n-min --draft-p-min -cd --ctx-size-draft -devd --device-draft -ngld --gpu-layers-draft --n-gpu-layers-draft -md --model-draft --spec-replace -mv --model-vocoder --tts-use-guide-tokens --embd-gemma-default --fim-qwen-1.5b-default --fim-qwen-3b-default --fim-qwen-7b-default --fim-qwen-7b-spec --fim-qwen-14b-spec --fim-qwen-30b-default --gpt-oss-20b-default --gpt-oss-120b-default --vision-gemma-4b-default --vision-gemma-12b-default "
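
Because each opts string is one very long line, diff reports the whole line as changed, which makes the actual differences hard to see. A flag-by-flag comparison is easier to read. The sketch below is only illustrative: the two opts strings are a small subset taken from the output above (not the full lists), and the variable and file names are made up for the example.

```shell
# Use a fixed collation so sort and comm agree on ordering.
LC_ALL=C
export LC_ALL

# Small illustrative subsets of the two opts strings shown above.
cli_opts="-h --help -p --prompt -f --file -m --model"
server_opts="-h --help -m --model --host --port"

tmpdir=$(mktemp -d)

# Write one flag per line, sorted, so that comm can compare the two sets.
# The variables are left unquoted on purpose so the shell splits them into words.
printf '%s\n' $cli_opts | sort > "$tmpdir/cli.txt"
printf '%s\n' $server_opts | sort > "$tmpdir/server.txt"

# Flags that appear only in the llama-cli subset.
only_cli=$(comm -23 "$tmpdir/cli.txt" "$tmpdir/server.txt")
# Flags that appear only in the llama-server subset.
only_server=$(comm -13 "$tmpdir/cli.txt" "$tmpdir/server.txt")

echo "only in llama-cli:"
echo "$only_cli"
echo "only in llama-server:"
echo "$only_server"

rm -rf "$tmpdir"
```

With the real completion files, the same approach would need the opts="..." line extracted from each script first; that step is left out here because it depends on the exact layout of the generated file.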

First Bad Commit

No response

Relevant log output
