* ggml : add ggml_flash_attn_ext_get_prec
* metal : use F16 precision in FA kernels
ggml-ci
* metal : minor clean-up
* metal : compile-guard bf16 FA kernels
ggml-ci
* build : remove obsolete compile flag [no ci]
* metal : prevent int overflows [no ci]
* cuda : disable BF16 FA
ggml-ci
* metal : fix BF16 requirement for FA kernels
ggml-ci
* make : clean-up [no ci]