Commit Graph

3 Commits

Author SHA1 Message Date
Johannes Gäßler
defe2158dd CUDA: mul_mat_v support for batch sizes > 1 (#14262)
* CUDA: mul_mat_v support for batch sizes > 1

* use 64 bit math for initial offset calculation
2025-06-23 13:11:31 +02:00
Johannes Gäßler
658987cfc9 CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)
* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID

* fix logic for RoPE support, CUDA graphs
2025-04-22 21:27:40 +02:00
Johannes Gäßler
c3ea58aca4 CUDA: remove DMMV, consolidate F16 mult mat vec (#10318) 2024-11-17 09:09:55 +01:00