llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-01 15:09:32 -04:00

Files

Jeff Bolz 9b169a4d4e vulkan: fix mul_mat_vec failure in backend tests (#12529)

The OOB calculation could be wrong if the last iteration was during one of
the unrolled loops. Adjust the unrolling counts to avoid this. Add a couple
new backend tests that hit this failure on NVIDIA GPUs.

2025-03-24 07:56:17 +01:00

ggml-blas

…

ggml-cann

[CANN]MUL_MAT optimization (#12382)

2025-03-15 09:31:08 +08:00

ggml-cpu

ggml : fix quantized cpy op (#12310)

2025-03-22 16:23:26 +02:00

ggml-cuda

musa: refine compute capability (#12493)

2025-03-22 10:11:37 +01:00

ggml-hip

HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (#12032)

2025-03-03 22:10:54 +01:00

ggml-kompute

…

ggml-metal

llama: Add support for RWKV v7 architecture (#12412)

2025-03-18 07:27:50 +08:00

ggml-musa

cuda : enable CUDA Graph on CUDA Toolkit < 12.x (#12394)

2025-03-17 20:25:13 +02:00

ggml-opencl

opencl: improve profiling (#12442)

2025-03-18 12:54:55 -07:00

ggml-rpc

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-sycl

sycl: cleanup oneDNN related code (#12097)

2025-03-21 10:15:56 +08:00

ggml-vulkan

vulkan: fix mul_mat_vec failure in backend tests (#12529)

2025-03-24 07:56:17 +01:00

CMakeLists.txt

[SYCL] Fix build on Windows when ccache enabled (#9954) (#9976)

2025-03-21 14:58:47 +08:00

ggml-alloc.c

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-backend-impl.h

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-backend-reg.cpp

ggml-backend : fix backend search path (#12330)

2025-03-11 14:25:17 +01:00

ggml-backend.cpp

ggml : portability fixes for VS 2017 (#12150)

2025-03-04 18:53:26 +02:00

ggml-common.h

CUDA: use arch list for compatibility check (#11775)

2025-02-11 00:17:22 +01:00

ggml-impl.h

MUSA: support ARM64 and enable dp4a .etc (#11843)

2025-02-21 09:46:23 +02:00

ggml-opt.cpp

…

ggml-quants.c

ggml : portability fixes for VS 2017 (#12150)

2025-03-04 18:53:26 +02:00

ggml-quants.h

…

ggml-threading.cpp

…

ggml-threading.h

…

ggml.c

llama: Add support for RWKV v7 architecture (#12412)

2025-03-18 07:27:50 +08:00

gguf.cpp

cmake : add sanitizer flags for llama.cpp (#11279)

2025-01-18 16:18:15 +02:00