llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-13 11:57:43 -04:00

Files

Akarshan Biswas cd1fce6d4f SYCL: Add set_rows support for quantized types (#14883)

* SYCL: Add set_rows support for quantized types

This commit adds support for the GGML_OP_SET_ROWS operation for various
quantized tensor types (Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, IQ4_NL) and the BF16
type in the SYCL backend.

The quantization/dequantization copy kernels were moved from cpy.cpp
to cpy.hpp to make them available for set_rows.cpp.

This addresses part of the TODOs mentioned in the code.

* Use get_global_linear_id() instead

ggml-ci

* Fix formatting

ggml-ci

* Use const for ne11 and size_t variables in set_rows_sycl_q

ggml-ci

* Increase block size for q kernel to 256

ggml-ci

* Cleanup imports

* Add float.h to cpy.hpp

2025-07-28 20:32:15 +05:30
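
For readers unfamiliar with the operation named in the commit above, the following is a minimal CPU-side sketch of what "set_rows into a quantized tensor" means. It is not the SYCL kernel from the commit. The names BlockQ8, quantize_block_q8 and set_rows_q8 are hypothetical, and the scale is kept as a plain float for simplicity (ggml's Q8_0 stores it as fp16). The sketch assumes float source rows, 64-bit row indices, and a row length that is a multiple of the 32-value block size.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values per quantization block (Q8_0 in ggml also uses 32).
constexpr int QK = 32;

// Hypothetical simplified block: one scale plus 32 signed 8-bit values.
struct BlockQ8 {
    float  d;        // scale (assumption: float here, fp16 in ggml)
    int8_t qs[QK];   // quantized values
};

// Quantize 32 floats into one block: d = max|x| / 127, q = round(x / d).
void quantize_block_q8(const float * x, BlockQ8 & out) {
    float amax = 0.0f;
    for (int i = 0; i < QK; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }
    const float d  = amax / 127.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;  // avoid division by zero
    out.d = d;
    for (int i = 0; i < QK; ++i) {
        out.qs[i] = static_cast<int8_t>(std::lround(x[i] * id));
    }
}

// Write each source row r into destination row idx[r], quantizing on the way.
// src holds idx.size() rows of ne0 floats; dst holds rows of ne0 / QK blocks.
void set_rows_q8(const std::vector<float> & src,
                 const std::vector<int64_t> & idx,
                 size_t ne0,
                 std::vector<BlockQ8> & dst) {
    assert(ne0 % QK == 0);
    const size_t blocks_per_row = ne0 / QK;
    for (size_t r = 0; r < idx.size(); ++r) {
        const size_t dst_row = static_cast<size_t>(idx[r]);
        for (size_t b = 0; b < blocks_per_row; ++b) {
            quantize_block_q8(&src[r * ne0 + b * QK],
                              dst[dst_row * blocks_per_row + b]);
        }
    }
}
```

In this sketch every (row, block) pair is independent of the others, so the two nested loops could be flattened into a single linear work-item index on a parallel backend.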

cmake

cmake : Indent ggml-config.cmake (ggml/1310)

2025-07-28 08:15:01 +03:00

include

ggml: Add initial WebGPU backend (#14521)

2025-07-16 18:18:51 +03:00

src

SYCL: Add set_rows support for quantized types (#14883)

2025-07-28 20:32:15 +05:30

.gitignore

…

CMakeLists.txt

ggml-cpu : disable GGML_NNPA by default due to instability (#14880)

2025-07-25 19:09:03 +02:00