CUDA: tuned mul_mat_q kernels (#2546)

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-09-01 21:04:58 -04:00

Johannes Gäßler, 2023-08-09 09:42:34 +02:00 (committed by GitHub)

parent f5bfea0580

commit 25d43e0eb5

3 changed files with 676 additions and 386 deletions

ggml-cuda.cu (1056 changes)

File diff suppressed because it is too large.