llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-26 10:09:41 -04:00

Files

Jeff Bolz a0374a67e2 vulkan: Handle updated FA dim2/3 definition (#14518)

* vulkan: Handle updated FA dim2/3 definition

Pack mask boolean and n_head_log2 into a single dword to keep the push
constant block under the 128B limit.

* handle null mask for gqa

* allow gqa with dim3>1
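
The packing described in the commit message can be illustrated with a minimal sketch. The bit layout below (low 16 bits for n_head_log2, bit 16 for the mask flag) and all function and constant names are assumptions made for illustration, not the layout or identifiers used in the actual Vulkan shaders. The point is only that a boolean and a small integer fit in one 32-bit word, which saves 4 bytes of push-constant space compared with storing them separately.

```cpp
#include <cstdint>

// Hypothetical layout (an assumption, not taken from the llama.cpp source):
// bits 0..15 hold n_head_log2, bit 16 holds the "mask present" flag.
constexpr uint32_t N_HEAD_LOG2_BITS = 0xFFFFu;
constexpr uint32_t HAS_MASK_SHIFT   = 16u;

// Pack both values into a single 32-bit word (one dword).
constexpr uint32_t pack_mask_n_head_log2(bool has_mask, uint32_t n_head_log2) {
    return (n_head_log2 & N_HEAD_LOG2_BITS) |
           (static_cast<uint32_t>(has_mask) << HAS_MASK_SHIFT);
}

// Recover the integer field from the packed word.
constexpr uint32_t unpack_n_head_log2(uint32_t packed) {
    return packed & N_HEAD_LOG2_BITS;
}

// Recover the boolean flag from the packed word.
constexpr bool unpack_has_mask(uint32_t packed) {
    return ((packed >> HAS_MASK_SHIFT) & 1u) != 0u;
}

// The packed value occupies exactly one dword.
static_assert(sizeof(uint32_t) == 4, "a dword is 4 bytes");
```

A shader reading the same push constant would apply the matching shift and mask to recover both fields.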

2025-07-05 09:26:04 +02:00

cmake

ggml-cpu : rework weak alias on apple targets (#14146)

2025-06-16 13:54:15 +08:00

include

ggml : implement GEGLU_ERF and GEGLU_QUICK ops (#14445)

2025-07-03 23:07:22 +02:00

src

vulkan: Handle updated FA dim2/3 definition (#14518)

2025-07-05 09:26:04 +02:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

ggml : remove kompute backend (#14501)

2025-07-03 07:48:32 +03:00