
Releases: ggml-org/llama.cpp

b7388

13 Dec 23:46
4ed2bae


Warning

Release Format Update: Linux releases will soon use .tar.gz archives instead of .zip. Please make the necessary changes to your deployment scripts.
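One way a deployment script could prepare for the archive change is to accept both formats during the transition. The sketch below is only an illustration: the function name `extract_release` and the idea of dispatching on the file suffix are assumptions, not something these release notes prescribe, and the actual asset names should be checked against the release page.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_release(archive: Path, dest: Path) -> None:
    """Extract a release archive that may be either .zip or .tar.gz.

    Hypothetical helper: dispatches on the file suffix so the same
    script keeps working before and after the format switch.
    """
    dest.mkdir(parents=True, exist_ok=True)
    name = archive.name
    if name.endswith((".tar.gz", ".tgz")):
        # New-style Linux archives (gzip-compressed tar).
        with tarfile.open(archive, mode="r:gz") as tar:
            tar.extractall(dest)
    elif name.endswith(".zip"):
        # Current archives.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    else:
        raise ValueError(f"unsupported archive format: {name}")
```

Only extract archives downloaded from a source you trust, since both formats can contain unexpected paths.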

server-models.cpp: add missing (#18000)

Fixes: #17999


b7387

13 Dec 21:33
5266379


llama_context: synchronize before reallocating output buffer (#17974)


b7386

13 Dec 21:25
4d5ae24


arg: fix common_params_parse not accepting negated arg (#17991)


b7385

13 Dec 20:54
66ba512


cmake: correct scope - link ws2_32 for MinGW/w64devkit builds in cpp-httplib (#17972)

  • fix - w64devkit build

  • fix - w64devkit build private scope


b7384

13 Dec 19:59
36255a2


vulkan: support get_rows for i32 (#17941)


b7383

13 Dec 18:56
3229a23


vulkan: support GGML_OP_DIAG (#17893)


b7382

13 Dec 18:37
303f861


vulkan: Multi-pass softmax for large number of cols (#17892)

When the number of cols is large, split each row across multiple workgroups.
There are three phases that communicate partial results through temp buffers:
(1) compute max partials
(2) take max of partials, compute sum(exp(x-max)) partials
(3) sum partials, compute scaled result
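The three phases above can be sketched on the CPU as follows. This is a minimal illustration, not the Vulkan shader from the PR: each chunk stands in for one workgroup, the Python lists `max_partials` and `sum_partials` stand in for the temp buffers, and the function name and `chunk_size` parameter are assumptions made for the example.

```python
import math


def multipass_softmax(row, chunk_size):
    """Softmax of one non-empty row, computed in three phases over chunks.

    Illustrative only: each chunk plays the role of a workgroup and the
    partial lists play the role of the temp buffers.
    """
    chunks = [row[i:i + chunk_size] for i in range(0, len(row), chunk_size)]

    # Phase 1: every chunk computes its own maximum.
    max_partials = [max(c) for c in chunks]

    # Phase 2: reduce the partial maxima to the row maximum, then every
    # chunk computes its partial sum of exp(x - max).
    row_max = max(max_partials)
    sum_partials = [sum(math.exp(x - row_max) for x in c) for c in chunks]

    # Phase 3: sum the partials and scale every element by the total.
    total = sum(sum_partials)
    return [math.exp(x - row_max) / total for c in chunks for x in c]
```

Subtracting the row maximum before exponentiating keeps the values in a safe range, which is why the maximum has to be known globally before any partial sum can be computed.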


b7381

13 Dec 17:15
3c6391e


speculative-simple : free batch on exit (#17985)


b7380

13 Dec 16:40
8e4d678


common : skip model validation when --completion-bash is requested (#17975)


b7379

13 Dec 15:38
07a10c1


vulkan: Allow non-pow2 n_experts in topk_moe (#17872)
