llama.cpp

Name: llama.cpp
Author: Ggml

llama.cpp is an efficient C++ implementation leveraging the GGML library to run large language model (LLM) inference locally, focusing on CPU efficiency with SIMD support and optional GPU acceleration.

Latest: 9102 Winget

Last checked: Jun 9, 2026 12:10am

Rank: 243/15140

Also monitored via:

GitHub Releases Site Monitor

Follow to track new versions in your feed.

Overview

License: MIT LicenseWinget: Available

Product Page Support Page Download Page

Version & Lifecycle

Current: 9102 N-2: 9085 Avg cadence: Every 1 day

Top Contributors

Top sitewide contributors:

View full leaderboard →

Community Notes

Deployment note • May 15, 2026

llama.cpp HTTP server deployment note

For managed llama.cpp deployments that expose a local model endpoint, the official server tool can run a GGUF model with an explicit bind address and port, for example llama-server -m models\model.gguf --host 127.0.0.1 --port 8080. Use a non-loopback host only when you intentionally want other systems to reach the service.

The project documents OpenAI-compatible chat-completions, responses, and embeddings routes in the HTTP server, so endpoint tests can validate both the Windows process and the API surface after packaging. Source: official llama.cpp HTTP server documentation.

Release Notes & Updates

Avg cadence: —

Next anticipated release: —

Updates • 0

Help us match vulnerabilities

No vulnerability match yet. Pick the right product:

Looking for matching products…

Don’t see it? Paste a CPE

Also known as

Other names people use for this app — helps search and matching.

llamacppggml llamacpp

Packaging Notes

Build from source using make; supports CPU and GPU backends (CUDA, Metal). Uses GGML for efficient tensor operations.

Notes

Requires model weights converted to GGML or GGUF format. Supports quantized models for faster inference and lower memory usage.

llama.cpp

Overview

Version & Lifecycle

Tags

Top Contributors

Community Notes

llama.cpp HTTP server deployment note

Release Notes & Updates

Help us match vulnerabilities

Also known as

Packaging Notes

Notes

Contribute a Community Note

Report an Issue

Explore

Account