From the course: Large Language Models on AWS: Building and Deploying Open-Source LLMs

GGUF file format

- [Presenter] This is the GGUF format architecture, something you'll hear a lot about when running local models, especially with llama.cpp. The big picture is that it helps AI models run efficiently. There are three main pieces: the original model, the GGUF format, and llama.cpp. Think of GGUF as a bridge between research, say someone training a model, such as Allen AI, and practical deployment. As a starting point, models typically use things like PyTorch or Hugging Face. These are great for training but not optimized for deployment, and they often come with multiple files and dependencies. If we look at the GGUF format in the center here, it combines everything into a single file: the model weights (the tensors), the configuration details, the tokenizer information, and the architecture metadata. Everything is packaged together so that you can run the model. So this is pretty convenient, because instead of having…
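To make the "single file" idea a little more concrete, here is a minimal Python sketch that reads only the fixed-size header at the start of a GGUF file. It is not from the course. It assumes the version 2 and 3 header layout (the 4-byte magic `GGUF`, a little-endian 32-bit version, then 64-bit tensor and metadata key/value counts). The function name `read_gguf_header` and the sample counts are made up for illustration, and the bytes are built in memory rather than read from a real model file.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(stream):
    """Read the fixed-size header from a binary stream (hypothetical helper).

    Assumes the GGUF v2/v3 layout: magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    # "<" means little-endian with no padding: 4 + 8 + 8 = 20 bytes.
    version, tensor_count, metadata_kv_count = struct.unpack(
        "<IQQ", stream.read(20)
    )
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


# Synthetic header bytes for illustration only; the counts are invented.
fake_header = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
header = read_gguf_header(io.BytesIO(fake_header))
print(header)
```

With a real model you would pass `open(path, "rb")` instead of the in-memory buffer. The metadata entries (configuration, tokenizer, architecture) and the tensor data follow this header in the same file, which is what lets a runtime such as llama.cpp load a model from one file.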