llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-18 05:56:00 -04:00

Files

Johannes Gäßler cb5fad4c6c CUDA: refactor and optimize IQ MMVQ (#8215)

* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix

2024-07-01 20:39:06 +02:00

ggml-cuda

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

ggml-sycl

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

kompute @ 4565194ed7

…

kompute-shaders

…

vulkan-shaders

…

CMakeLists.txt

…

ggml-alloc.c

…

ggml-backend-impl.h

…

ggml-backend.c

…

ggml-blas.cpp

…

ggml-common.h

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

ggml-cuda.cu

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

ggml-impl.h

…

ggml-kompute.cpp

…

ggml-metal.m

…

ggml-metal.metal

…

ggml-quants.c

…

ggml-quants.h

…

ggml-rpc.cpp

…

ggml-sycl.cpp

[SYCL] Update SYCL-Rope op and Refactor (#8157)

2024-07-01 19:39:06 +08:00

ggml-vulkan-shaders.hpp

…

ggml-vulkan.cpp

…

ggml.c

…

sgemm.cpp

…

sgemm.h

…