Files
llama.cpp/ggml-cuda
Johannes Gäßler 9a590c8226 CUDA: optimize MMQ int8 tensor core performance (#8062)
* CUDA: optimize MMQ int8 tensor core performance

* only a single get_mma_tile_x_k function

* simplify code, make functions constexpr
2024-06-24 12:41:23 +02:00
..
2024-06-05 16:53:00 +02:00
2024-03-29 17:45:46 +02:00
2024-04-30 12:16:08 +03:00
2024-06-05 11:29:20 +03:00
2024-06-17 00:23:04 +02:00
2024-06-17 00:23:04 +02:00