CUDA: MMQ code deduplication + iquant support (#8495)

* CUDA: MMQ code deduplication + iquant support * 1 less parallel job for CI build
2025-08-14 04:17:53 -04:00 · 2024-07-20 22:25:26 +02:00
parent 07283b1a90
commit 69c487f4ed
11 changed files with 800 additions and 639 deletions
--- a/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu
+++ b/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu
@@ -0,0 +1,5 @@
+// This file has been autogenerated by generate_cu_files.py, do not edit manually.
+
+#include "../mmq.cuh"
+
+DECL_MMQ_CASE(GGML_TYPE_IQ3_S);