CUDA: refactor and optimize IQ MMVQ (#8215)

* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
Johannes Gäßler
2024-07-01 20:39:06 +02:00
committed by GitHub
parent dae57a1ebc
commit cb5fad4c6c
8 changed files with 406 additions and 487 deletions

File diff suppressed because it is too large