llama.cpp

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-06 17:13:34 -04:00

Francis Couture-Harpin dd3e62a703 ggml : add some informative comments in q1_3 vec_dot

2024-07-28 21:17:16 -04:00

ggml-cuda

…

ggml-sycl

…

kompute @ 4565194ed7

…

kompute-shaders

…

vulkan-shaders

…

CMakeLists.txt

ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (#8140)

2024-06-26 21:34:14 +02:00

ggml-alloc.c

…

ggml-backend-impl.h

…

ggml-backend.c

…

ggml-blas.cpp

…

ggml-common.h

bitnet : replace 1.58b with b1.58, as in the paper

2024-06-28 20:38:12 -04:00

ggml-cuda.cu

…

ggml-impl.h

ggml-quants : attempt to fix Arm 32-bit support

2024-06-28 22:52:57 -04:00

ggml-kompute.cpp

…

ggml-metal.m

…

ggml-metal.metal

…

ggml-quants.c

ggml : add some informative comments in q1_3 vec_dot

2024-07-28 21:17:16 -04:00

ggml-quants.h

ggml-quants : 1.625 bpw ternary packing for BitNet 1.58b

2024-06-27 02:06:22 -04:00

ggml-rpc.cpp

…

ggml-sycl.cpp

…

ggml-vulkan-shaders.hpp

…

ggml-vulkan.cpp

…

ggml.c

ggml-quants : ARM NEON vec_dot for q2_2 and q1_3

2024-06-27 02:06:28 -04:00

sgemm.cpp

…

sgemm.h

…