llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-09-03 13:48:51 -04:00

Files

Georgi Gerganov 1da7b76569 server : fix speculative decoding with context shift (#10641)

* server : fix speculative decoding with context shift

ggml-ci

* server : take into account speculative limits

ggml-ci

* server : add tests

2024-12-04 22:38:20 +02:00

test_basic.py

…

test_chat_completion.py

2024-12-02 14:45:54 +01:00

test_completion.py

…

test_ctx_shift.py

…

test_embedding.py

…

test_infill.py

…

test_lora.py

…

test_rerank.py

…

test_security.py

…

test_slot_save.py

…

test_speculative.py

2024-12-04 22:38:20 +02:00

test_tokenize.py

…