llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-28 13:20:27 -04:00

Files

Georgi Gerganov 3637576288 server : disable speculative decoding for SWA models (#13970)

* server : use swa-full for draft context

ggml-ci

* server : disable speculative decoding for SWA models

2025-06-02 21:34:40 +03:00

sync : vendor (#13901)

2025-05-30 16:25:45 +03:00

CMakeLists.txt

2025-05-05 16:02:55 +02:00