llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-26 19:23:37 -04:00

Files

Georgi Gerganov 3637576288 server : disable speculative decoding for SWA models (#13970)

* server : use swa-full for draft context

ggml-ci

* server : disable speculative decoding for SWA models

2025-06-02 21:34:40 +03:00

sync : vendor (#13901)

2025-05-30 16:25:45 +03:00

CMakeLists.txt

…