llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-08-15 20:53:00 -04:00

Files

Srihari-mcw 3d82dbcbce ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (#12332)

* Add block interleaving support for Q4_K quantization

* Remove whitespaces and fix CI/CD issues

* Update pointer of bsums from int16_t to const int16_t

* Add vector version of quantize_q8_K_4x8 function

* Update code formatting based on review comments

2025-03-20 13:35:34 +02:00

cmake

cmake : enable building llama.cpp using system libggml (#12321)

2025-03-17 11:05:23 +02:00

include

llama: Add support for RWKV v7 architecture (#12412)

2025-03-18 07:27:50 +08:00

src

ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (#12332)

2025-03-20 13:35:34 +02:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

SYCL: using graphs is configurable by environment variable and compile option (#12371)

2025-03-18 11:16:31 +01:00