cli: new CLI experience #17824
Conversation
When we read a file, maybe we can start processing the prompt immediately? Just post a task with …
Ah, never mind, I see what you mean. Hmm, yeah, that could also be a good idea.
@ngxson great work on this. You're lightning fast ⚡️!! 😊
I hope that llama-completion will remain available long term. With this new CLI experience it's not possible to output to a file or do raw completions, among other things. Not everyone wants a chat experience. It would also be super helpful if llama-completion were documented somewhere. Thank you for your work.
Well, we are missing … Meanwhile, there is no way to disable this.
Hello @ngxson, what does it mean …
Friendly reminder that this is an open-source project and missing features can be added by contributors. I won't comment further on missing features; there are already TODOs in the code for this purpose.
I don't want to offend anyone, but I can just predict that …
@andrew-aladev please see my comments on the referenced issue. It is fine to reference this PR in issues, but it is best to keep the conversation in those issues. It is good that folks speak up if there's a need and file bug reports. These changes are needed to move forward because of technical debt in llama-cli that built up during the evolution of its capabilities. If there are any further issues regarding this, please copy me on them and I will work on cataloging user needs.
That is obviously an oversight that needs to be fixed.
Guys (@andrew-aladev, @MB7979, and others who complain first, listen later): if you have the time to post here (multiple posts, even), you certainly have the time for due diligence, e.g. #17618, which starts off with exactly that clarification. This kind of thing is why open-source maintainers[1] lose all of their hair, drop out, and decide to go to nursing school. [1] Yes, I am one.
That message was edited to add that clarification 30 minutes ago. So you are chastising us for not finding something that didn't exist when I posted, and that is in fact a direct response to said questions being raised. For what it's worth, I did check PRs, issues, and discussions yesterday. I missed that discussion (it's a few weeks old), and as soon as I found it I took my feedback there.
@MB7979 Didn't you ignore this message, which has been there for a whole 2 weeks?
LOG_WRN("*****************************\n");
LOG_WRN("IMPORTANT: The current llama-cli will be moved to llama-completion in the near future\n");
LOG_WRN(" New llama-cli will have enhanced features and improved user experience\n");
LOG_WRN(" More info: https://github.com/ggml-org/llama.cpp/discussions/17618\n");
LOG_WRN("*****************************\n");
If you missed it, you have just implicitly proved that …
I had not updated llama.cpp for a few weeks. I only did so on reading the new CLI commit, as I was concerned the old functionality would be removed, which it was. I really don't understand the defensive tone being taken here. I'm not sure about the other poster, but my only intent was to ascertain whether llama-completion would be an ongoing part of the project, and to suggest some documentation to go with the changes to avoid people like me wasting your time with such questions. I take it feedback is unwelcome here and I will not participate further.
I am just speaking the truth.
Then what is your suggestion? There was already a discussion and a notice inside llama-cli itself.
If we did have better documentation, what would prevent you from missing it again? (Again, I'm just speaking the truth here.)
Include it with all the other examples listed on the front page of this repository.
you said earlier:
Probably fair, I haven't touched the main README.md for a long time - even the last 2 big changes in llama-server and llama-cli weren't on the list.
@MB7979 TBH I was much more addressing @andrew-aladev than you. Apologies if I offended; I knew about these changes at least from the beginning of this week, but I was mistaken in how I knew 😬 (not unusual for me :) )
For the many folks who rely on the old llama-cli features: no need to panic, llama-completion isn't going anywhere. It is the same old (legacy) llama-cli application with a new name. Just because it is deemed "legacy" does not mean it will be deleted 🙂 Lots of changes happen quickly in this codebase compared to other projects because of the interest in AI. It is sometimes hard to track everything happening, but it is important for everyone to try their best. I will be creating a discussion in the following week to establish some of the user journeys and see if I can't come up with a roadmap of sorts.
* wip
* wip
* fix logging, add display info
* handle commands
* add args
* wip
* move old cli to llama-completion
* rm deprecation notice
* move server to a shared library
* move ci to llama-completion
* add loading animation
* add --show-timings arg
* add /read command, improve LOG_ERR
* add args for speculative decoding, enable show timings by default
* add arg --image and --audio
* fix windows build
* support reasoning_content
* fix llama2c workflow
* color default is auto
* fix merge conflicts
* properly fix color problem

Co-authored-by: bandoti <[email protected]>

* better loading spinner
* make sure to clean color on force-exit
* also clear input files on "/clear"
* simplify common_log_flush
* add warning in mtmd-cli
* implement console writter
* fix data race
* add attribute
* fix llama-completion and mtmd-cli
* add some notes about console::log
* fix compilation

---------

Co-authored-by: bandoti <[email protected]>
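Since the commit list above mentions a console writer, a loading spinner, and a data race fix, here is a minimal hypothetical sketch of why a single mutex-guarded writer helps when two threads write to the terminal. ConsoleWriter and write_line are illustrative names assumed for this sketch, not the actual llama.cpp console API.

```cpp
// Hypothetical sketch only -- not the real llama.cpp console implementation.
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>

class ConsoleWriter {
public:
    // serialize all terminal writes so the spinner thread and the main
    // thread never interleave partial lines (the "data race" concern)
    void write_line(const std::string & text) {
        std::lock_guard<std::mutex> lock(mtx_);
        std::fputs(text.c_str(), stdout);
        std::fputc('\n', stdout);
        std::fflush(stdout);
    }

private:
    std::mutex mtx_;
};

int main() {
    ConsoleWriter console;

    // e.g. a loading-spinner thread and the main thread writing concurrently
    std::thread spinner([&] { console.write_line("loading model..."); });
    console.write_line("> hello");

    spinner.join();
    return 0;
}
```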
https://github.com/ggml-org/llama.cpp/blame/master/tools/llama-bench/README.md has an outdated link to main/README.md; IMHO it should be updated too. Happy to help if needed.
Sure, that would be great. If you find any broken links like that, please feel free to submit a PR. Thank you!
I've added PR #17993; it fixes dead links, including the link in llama-bench.
That's great!
Notable changes:
- Fix race conditions in threadpool (ggml-org#17748)
- New CLI experience (ggml-org#17824)
- Vision model improvements (clip refactor, new models)
- Performance fixes (CUDA MMA, Vulkan improvements)
- tools/main renamed to tools/completion

Conflict resolution:
- ggml-cpu.c: Use new threadpool->n_threads API (replaces n_threads_max), keep warning suppressed to reduce log noise

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Ref: #17618
Fix: #11202
We are moving to a new CLI experience, with the main code built on top of llama-server. This brings many additional features into llama-cli, making the experience feel mostly like a smaller version of the web UI (minus anything llama-cli doesn't support).

TODO:
- reasoning_content when possible, depends on server: delegate result_state creation to server_task #17835
- console::readline #17828
- console::readline #17829
- --image and --audio arguments
- llama-completion for legacy features

Features planned for next versions:
TODO for console system: