llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-18 05:56:00 -04:00

Files

Diego Devesa a394039db0 ggml-cpu : add chunking support to mul_mat_id (#11666)

* ggml-cpu : add chunking support to mul_mat_id

* allocate chunk counter in wdata
parallelize src1 quantization by column to allow parallelization even when there is only one row

* disable for arm

* cleanup

* better way to disable for arm

* fix uninitialized counter when using 1 thread only

* revert test-backend-ops changes

2025-02-13 01:02:38 +01:00

cmake

cmake: add ggml find package (#11369)

2025-01-26 12:07:48 -04:00

include

cleanup: fix compile warnings associated with gnu_printf (#11811)

2025-02-12 10:06:53 -04:00

src

ggml-cpu : add chunking support to mul_mat_id (#11666)

2025-02-13 01:02:38 +01:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096)

2025-02-04 12:59:15 +02:00