* ggml : fix FA mask dims 2 and 3 (ggml-ci)
* backends : mark batched FA as unsupported in CUDA and Vulkan (ggml-ci)
* vulkan : disable FA for mask->ne[2] != 1
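
To illustrate the kind of check the last two bullets describe, here is a minimal sketch of how a backend's op-support test could reject flash attention when the KQ mask is batched (dims 2 or 3 != 1). This is an assumption-laden illustration, not the actual ggml or Vulkan backend code: the trimmed struct layout and the function name `backend_supports_flash_attn` are hypothetical.

```c
// Hypothetical sketch; names and struct layout are illustrative,
// not the real ggml/Vulkan backend implementation.
#include <stdbool.h>
#include <stdint.h>

struct ggml_tensor {
    int64_t ne[4];                  // tensor dimensions (ne[2]/ne[3] = broadcast/batch dims)
    struct ggml_tensor * src[4];    // operation inputs (trimmed for this sketch)
};

// Reject flash attention when the mask has a batch dimension,
// mirroring "disable FA for mask->ne[2] != 1" above.
static bool backend_supports_flash_attn(const struct ggml_tensor * op) {
    const struct ggml_tensor * mask = op->src[3]; // KQ mask input, if present
    if (mask != NULL && (mask->ne[2] != 1 || mask->ne[3] != 1)) {
        return false; // batched FA masks are not supported by this backend
    }
    return true;
}
```

With a check like this in place, ggml would fall back to the non-FA attention path for such ops instead of producing incorrect results on the affected backends.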