llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-07-28 13:20:27 -04:00

Files

Jeff Bolz cc98896db8 vulkan: optimize and reenable split_k (#10637)

Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to fill the shaders.
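
For readers unfamiliar with the term, the general split-k idea can be sketched on the CPU as follows. This is an illustrative sketch only, not the Vulkan shader code from the commit: the function name `matmul_split_k`, its parameters, and the row-major layout are all assumptions made for the example. The K dimension is cut into slices, each slice produces a full partial result independently, and a separate reduce pass sums the partials. On a GPU this gives more independent work when the output matrix alone yields too few workgroups.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU illustration of split-k; NOT the actual Vulkan shaders.
// a is M x K and b is K x N, both row-major. split_k must be >= 1.
std::vector<float> matmul_split_k(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t m, std::size_t n, std::size_t k,
                                  std::size_t split_k) {
    const std::size_t slice = (k + split_k - 1) / split_k;  // ceil(k / split_k)
    std::vector<float> partial(split_k * m * n, 0.0f);

    // Phase 1: every K slice is independent work and writes its own
    // M x N partial result (on a GPU, a separate set of workgroups).
    for (std::size_t s = 0; s < split_k; ++s) {
        const std::size_t k_begin = std::min(k, s * slice);
        const std::size_t k_end = std::min(k, k_begin + slice);
        float* out = partial.data() + s * m * n;
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t kk = k_begin; kk < k_end; ++kk)
                for (std::size_t j = 0; j < n; ++j)
                    out[i * n + j] += a[i * k + kk] * b[kk * n + j];
    }

    // Phase 2: the reduce step sums the split_k partial results
    // element by element into the final output.
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t s = 0; s < split_k; ++s)
        for (std::size_t e = 0; e < m * n; ++e)
            c[e] += partial[s * m * n + e];
    return c;
}
```

The reduce pass in phase 2 reads contiguous memory, which is the kind of access pattern where vector loads can help; how the real mul_mat_split_k_reduce shader does this is not shown here.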

2024-12-03 20:29:54 +01:00

2024-11-29 21:54:58 +01:00

2024-12-03 20:29:54 +01:00

.gitignore      2024-07-13 18:12:39 +02:00
CMakeLists.txt  2024-12-01 16:12:41 +01:00