llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-09-06 07:11:25 -04:00

Files

Srihari-mcw baad94885d ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)

* Initial Q2_K Block Interleaving Implementation

* Addressed review comments and clean up of the code

* Post rebase fixes

* Initial CI/CD fixes

* Update declarations in arch-fallback.h

* Changes for GEMV Q2_K in arch-fallback.h

* Enable repacking only on AVX-512 machines

* Update comments in repack.cpp

* Address q2k comments

---------

Co-authored-by: Manogna-Sree <elisetti.manognasree@multicorewareinc.com>

2025-08-01 09:20:33 +03:00

cmake

cmake : Fix BLAS link interface (ggml/1316)

2025-07-30 17:33:11 +03:00

include

ggml: Add initial WebGPU backend (#14521)

2025-07-16 18:18:51 +03:00

src

ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)

2025-08-01 09:20:33 +03:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

HIP: add GGML_HIP_MMQ_MFMA option to allow disabling the MFMA path. (#14930)

2025-07-29 17:44:30 +02:00