Closed
Description
Name and Version
./result/bin/llama-cli --version
version: 7342 (2fbe3b7)
built with GNU 14.3.0 for Linux x86_64
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-cli
Command line
./result/bin/llama-cli --completion-bash
error: --model is required
Indeed:
./result/bin/llama-cli -m /home/teto/llama-models/Ministral-3-14B-Base-2512.Q6_K.gguf --completion-bash
triggers the bash generation.
Problem description & steps to reproduce
I noticed that llama-server has --completion-bash, so I added completion support for it to my Linux distribution. Then I thought other executables might support it as well.
So I tried adding it for llama-cli. But then I noticed that the generated autocomplete script seems to be shared among programs via:
complete -F _llama_completions llama-batched
complete -F _llama_completions llama-batched-bench
complete -F _llama_completions llama-bench
complete -F _llama_completions llama-cli
complete -F _llama_completions llama-convert-llama2c-to-ggml
complete -F _llama_completions llama-cvector-generator
complete -F _llama_completions llama-embedding
complete -F _llama_completions llama-eval-callback
complete -F _llama_completions llama-export-lora
complete -F _llama_completions llama-gen-docs
complete -F _llama_completions llama-gguf
....
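For readers unfamiliar with how one completion function ends up serving many binaries, here is a minimal sketch. The body of `_llama_completions` below is a hypothetical, heavily abbreviated stand-in (the real function is whatever `--completion-bash` prints), and the list of tool names is just a sample taken from the lines above. It only shows the registration mechanism and assumes bash.

```shell
# Hypothetical stand-in for the generated function; the real one is the
# output of `--completion-bash` and carries the full option list.
_llama_completions() {
    local cur="${COMP_WORDS[COMP_CWORD]}"
    local opts="-h --help --version -m --model"   # abbreviated for illustration
    COMPREPLY=( $(compgen -W "${opts}" -- "${cur}") )
}

# `complete` accepts several command names at once, so a single call
# registers the same function for every listed binary.
llama_tools=(llama-batched llama-bench llama-cli llama-embedding)
complete -F _llama_completions "${llama_tools[@]}"

# Show what was registered for one of them.
complete -p llama-cli
```

Because every name points at the same function, all of these programs are offered the same `opts` list, whichever binary generated the script.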
I ran a diff between the output of llama-cli --completion-bash and llama-server --completion-bash to make sure, and they differ in one line, the opts list:
➜ diff llama-cli.completion.bash ./result/share/bash-completion/completions/llama-server.bash
7c7
< opts="-h --help --usage --version -cl --cache-list --completion-bash --verbose-prompt -t --threads -tb --threads-batch -C --cpu-mask -Cr --cpu-range --cpu-strict --prio --poll -Cb --cpu-mask-batch -Crb --cpu-range-batch --cpu-strict-batch --prio-batch --poll-batch -c --ctx-size -n --predict --n-predict -b --batch-size -ub --ubatch-size --keep --swa-full --kv-unified -kvu -fa --flash-attn -p --prompt --no-perf -f --file -bf --binary-file -e --escape --no-escape --rope-scaling --rope-scale --rope-freq-base --rope-freq-scale --yarn-orig-ctx --yarn-ext-factor --yarn-attn-factor --yarn-beta-slow --yarn-beta-fast -nkvo --no-kv-offload -nr --no-repack --no-host -ctk --cache-type-k -ctv --cache-type-v -dt --defrag-thold -np --parallel --mlock --no-mmap --numa -dev --device --list-devices --override-tensor -ot --cpu-moe -cmoe --n-cpu-moe -ncmoe -ngl --gpu-layers --n-gpu-layers -sm --split-mode -ts --tensor-split -mg --main-gpu --check-tensors --override-kv --no-op-offload --lora --lora-scaled --control-vector --control-vector-scaled --control-vector-layer-range -m --model -mu --model-url -dr --docker-repo -hf -hfr --hf-repo -hfd -hfrd --hf-repo-draft -hff --hf-file -hfv -hfrv --hf-repo-v -hffv --hf-file-v -hft --hf-token --log-disable --log-file --log-colors -v --verbose --log-verbose --offline -lv --verbosity --log-verbosity --log-prefix --log-timestamps -ctkd --cache-type-k-draft -ctvd --cache-type-v-draft --samplers -s --seed --sampling-seq --sampler-seq --ignore-eos --temp --top-k --top-p --min-p --top-nsigma --xtc-probability --xtc-threshold --typical --repeat-last-n --repeat-penalty --presence-penalty --frequency-penalty --dry-multiplier --dry-base --dry-allowed-length --dry-penalty-last-n --dry-sequence-breaker --dynatemp-range --dynatemp-exp --mirostat --mirostat-lr --mirostat-ent -l --logit-bias --grammar --grammar-file -j --json-schema -jf --json-schema-file --no-display-prompt -co --color --no-context-shift --context-shift -sys --system-prompt -sysf 
--system-prompt-file -ptc --print-token-count --prompt-cache --prompt-cache-all --prompt-cache-ro -r --reverse-prompt -sp --special -cnv --conversation -no-cnv --no-conversation -st --single-turn -i --interactive -if --interactive-first -mli --multiline-input --in-prefix-bos --in-prefix --in-suffix --no-warmup -gan --grp-attn-n -gaw --grp-attn-w --jinja --no-jinja --reasoning-format --reasoning-budget --chat-template --chat-template-file --simple-io "
---
> opts="-h --help --usage --version -cl --cache-list --completion-bash --verbose-prompt -t --threads -tb --threads-batch -C --cpu-mask -Cr --cpu-range --cpu-strict --prio --poll -Cb --cpu-mask-batch -Crb --cpu-range-batch --cpu-strict-batch --prio-batch --poll-batch -c --ctx-size -n --predict --n-predict -b --batch-size -ub --ubatch-size --keep --swa-full --kv-unified -kvu -fa --flash-attn --no-perf -e --escape --no-escape --rope-scaling --rope-scale --rope-freq-base --rope-freq-scale --yarn-orig-ctx --yarn-ext-factor --yarn-attn-factor --yarn-beta-slow --yarn-beta-fast -nkvo --no-kv-offload -nr --no-repack --no-host -ctk --cache-type-k -ctv --cache-type-v -dt --defrag-thold -np --parallel --mlock --no-mmap --numa -dev --device --list-devices --override-tensor -ot --cpu-moe -cmoe --n-cpu-moe -ncmoe -ngl --gpu-layers --n-gpu-layers -sm --split-mode -ts --tensor-split -mg --main-gpu --check-tensors --override-kv --no-op-offload --lora --lora-scaled --control-vector --control-vector-scaled --control-vector-layer-range -m --model -mu --model-url -dr --docker-repo -hf -hfr --hf-repo -hfd -hfrd --hf-repo-draft -hff --hf-file -hfv -hfrv --hf-repo-v -hffv --hf-file-v -hft --hf-token --log-disable --log-file --log-colors -v --verbose --log-verbose --offline -lv --verbosity --log-verbosity --log-prefix --log-timestamps -ctkd --cache-type-k-draft -ctvd --cache-type-v-draft --samplers -s --seed --sampling-seq --sampler-seq --ignore-eos --temp --top-k --top-p --min-p --top-nsigma --xtc-probability --xtc-threshold --typical --repeat-last-n --repeat-penalty --presence-penalty --frequency-penalty --dry-multiplier --dry-base --dry-allowed-length --dry-penalty-last-n --dry-sequence-breaker --dynatemp-range --dynatemp-exp --mirostat --mirostat-lr --mirostat-ent -l --logit-bias --grammar --grammar-file -j --json-schema -jf --json-schema-file --ctx-checkpoints --swa-checkpoints --cache-ram -cram --no-context-shift --context-shift -r --reverse-prompt -sp --special --no-warmup 
--spm-infill --pooling -cb --cont-batching -nocb --no-cont-batching --mmproj --mmproj-url --no-mmproj --no-mmproj-offload --image-min-tokens --image-max-tokens --override-tensor-draft -otd --cpu-moe-draft -cmoed --n-cpu-moe-draft -ncmoed -a --alias --host --port --path --api-prefix --no-webui --embedding --embeddings --reranking --rerank --api-key --api-key-file --ssl-key-file --ssl-cert-file --chat-template-kwargs -to --timeout --threads-http --cache-reuse --metrics --props --slots --no-slots --slot-save-path --media-path --models-dir --models-max --no-models-autoload --jinja --no-jinja --reasoning-format --reasoning-budget --chat-template --chat-template-file --no-prefill-assistant -sps --slot-prompt-similarity --lora-init-without-apply -td --threads-draft -tbd --threads-batch-draft --draft-max --draft --draft-n --draft-min --draft-n-min --draft-p-min -cd --ctx-size-draft -devd --device-draft -ngld --gpu-layers-draft --n-gpu-layers-draft -md --model-draft --spec-replace -mv --model-vocoder --tts-use-guide-tokens --embd-gemma-default --fim-qwen-1.5b-default --fim-qwen-3b-default --fim-qwen-7b-default --fim-qwen-7b-spec --fim-qwen-14b-spec --fim-qwen-30b-default --gpt-oss-20b-default --gpt-oss-120b-default --vision-gemma-4b-default --vision-gemma-12b-default "
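Since the whole option list sits on a single line, the raw diff above is hard to read. A possible way to see which flags are unique to each program is to split each `opts="..."` line into one flag per line and compare the sorted lists with `comm`. The sketch below is not part of llama.cpp: the helper name `extract_opts` and the two tiny sample files are made up for illustration, and it assumes the generated script keeps the whole `opts="..."` assignment on one line.

```shell
# Use byte-wise collation so `sort` and `comm` agree on ordering.
export LC_ALL=C

# Print one flag per line from the opts="..." line of a completion script.
extract_opts() {
    sed -n 's/^[[:space:]]*opts="\(.*\)"[[:space:]]*$/\1/p' "$1" \
        | tr -s ' ' '\n' | sed '/^$/d' | sort -u
}

# Tiny made-up samples standing in for the two generated scripts.
tmpdir=$(mktemp -d)
printf 'opts="-h --help -p --prompt --jinja "\n' > "$tmpdir/cli.bash"
printf 'opts="-h --help --host --port --jinja "\n' > "$tmpdir/server.bash"

extract_opts "$tmpdir/cli.bash" > "$tmpdir/cli.opts"
extract_opts "$tmpdir/server.bash" > "$tmpdir/server.opts"

# comm -23: lines only in the first file; comm -13: only in the second.
only_cli=$(comm -23 "$tmpdir/cli.opts" "$tmpdir/server.opts")
only_server=$(comm -13 "$tmpdir/cli.opts" "$tmpdir/server.opts")

echo "only in llama-cli:";    echo "$only_cli"
echo "only in llama-server:"; echo "$only_server"

rm -rf "$tmpdir"
```

To run this against the real files, replace the two `printf` lines with copies of the scripts being compared, for example llama-cli.completion.bash and llama-server.bash from the diff command above.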
First Bad Commit
No response