graph : make FA compatible with MLA + add initial Metal kernels (#12953)

* graph : make mla compatible with FA

* metal : add exp FA kernels for DeepSeek models

ggml-ci

* llama : minor naming updates

ggml-ci

* ggml : disable FA for DS head sizes

* tests : add FA tests for MLA shapes

ggml-ci
Georgi Gerganov
2025-04-17 18:16:36 +03:00
committed by GitHub
parent 207c22ec2d
commit 2f74c354c0
8 changed files with 117 additions and 26 deletions

@@ -9261,6 +9261,7 @@ static bool ggml_backend_vk_device_supports_op(ggml_backend_dev_t dev, const ggm
 case 112:
 case 128:
 case 256:
+case 575: // DeepSeek MLA
 break;
 default:
 return false;