llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-27 10:38:56 -04:00


cmdr2 f54a4ba11e Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)

* Support float16-to-float16 add/sub/mul/div operations in the CUDA backend

* Add fp16 support for add/sub/mul/div on the CPU backend

* Add test cases for fp16 add/sub/mul/div

2025-03-03 18:18:11 +02:00

ggml-blas

ggml : add support for dynamic loading of backends (#10469)

2024-11-25 15:13:39 +01:00

ggml-cann

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-cpu

Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)

2025-03-03 18:18:11 +02:00

ggml-cuda

Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)

2025-03-03 18:18:11 +02:00

ggml-hip

CUDA: app option to compile without FlashAttention (#12025)

2025-02-22 20:44:34 +01:00

ggml-kompute

llama : add Qwen2VL support + multimodal RoPE (#10361)

2024-12-14 14:43:46 +02:00

ggml-metal

metal : copy kernels for quant to F32/F16 conversions (#12017)

2025-02-25 11:27:58 +02:00

ggml-musa

CUDA: app option to compile without FlashAttention (#12025)

2025-02-22 20:44:34 +01:00

ggml-opencl

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-rpc

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-sycl

SYCL: Move CPY kernels to a separate file and add few missing kernels (#12133)

2025-03-03 11:07:22 +01:00

ggml-vulkan

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

CMakeLists.txt

ci: use sccache on windows instead of ccache (#11545)

2025-01-31 17:12:40 +00:00

ggml-alloc.c

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-backend-impl.h

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-backend-reg.cpp

ggml-backend : keep paths in native string type when possible (#12144)

2025-03-02 22:11:00 +01:00

ggml-backend.cpp

ggml : upgrade init_tensor API to return a ggml_status (#11854)

2025-02-28 14:41:47 +01:00

ggml-common.h

CUDA: use arch list for compatibility check (#11775)

2025-02-11 00:17:22 +01:00

ggml-impl.h

MUSA: support ARM64 and enable dp4a .etc (#11843)

2025-02-21 09:46:23 +02:00

ggml-opt.cpp

ggml-opt: fix data corruption (ggml/1022)

2024-11-21 09:22:02 +02:00

ggml-quants.c

ggml : refactor online repacking (#10446)

2024-12-07 14:37:50 +02:00

ggml-quants.h

ggml : build backends as libraries (#10256)

2024-11-14 18:04:35 +01:00

ggml-threading.cpp

ggml : build backends as libraries (#10256)

2024-11-14 18:04:35 +01:00

ggml-threading.h

remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)

2024-12-12 19:02:49 +01:00

ggml.c

ggml-cpu: Support s390x SIMD Instruction Set (#12019)

2025-02-22 21:39:24 +00:00

gguf.cpp

cmake : add sanitizer flags for llama.cpp (#11279)

2025-01-18 16:18:15 +02:00