CUDA: optimize and refactor MMQ (#8416)

* CUDA: optimize and refactor MMQ

* explicit q8_1 memory layouts, add documentation
Johannes Gäßler
2024-07-11 16:47:47 +02:00
committed by GitHub
parent a977c11544
commit 808aba3916
5 changed files with 867 additions and 687 deletions
