tqcq/llama.cpp
Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-07-31 22:53:52 -04:00
Path: llama.cpp/tools/server/tests/unit @ 9065ca71a29009b7a090879e6dbe4117c9632862
Latest commit: f13847cfb5 by Olivier Chafik — server: fix regression on streamed non-chat completion w/ stops (#13785) (2025-05-26 14:16:37 +01:00)
  * more forgiving message diffs: partial stop words aren't erased, full stops are
  * Add (slow) server test for completion + stream + stop
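The commit above concerns how a streaming server handles stop strings: text that could be the start of a stop word is held back rather than emitted, while a complete stop word ends the stream. A minimal sketch of that idea (hypothetical helper, not the actual server code) might look like:

```python
def stream_chunk(buffer: str, stops: list[str]) -> tuple[str, bool]:
    """Return (text safe to emit, stopped) for an accumulated stream buffer.

    If a full stop string is present, emit only the text before it and stop.
    Otherwise, hold back the longest suffix that could still grow into a
    stop string, so partial stop words are never shown to the client.
    """
    # A complete stop string ends the stream immediately.
    for s in stops:
        i = buffer.find(s)
        if i != -1:
            return buffer[:i], True
    # Hold back any suffix that is a proper prefix of some stop string.
    hold = 0
    for s in stops:
        for n in range(1, len(s)):
            if buffer.endswith(s[:n]):
                hold = max(hold, n)
    return buffer[: len(buffer) - hold], False
```

For example, with stop string `"world"`, the buffer `"hello wor"` emits only `"hello "` (the trailing `"wor"` is withheld in case the stream completes the stop word), while `"hello world!"` emits `"hello "` and stops.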
test_basic.py            …
test_chat_completion.py  server: streaming of tool calls and thoughts when --jinja is on (#12379) (2025-05-25 01:48:08 +01:00)
test_completion.py       server: fix regression on streamed non-chat completion w/ stops (#13785) (2025-05-26 14:16:37 +01:00)
test_ctx_shift.py        server : do not return error out of context (with ctx shift disabled) (#13577) (2025-05-16 21:50:00 +02:00)
test_embedding.py        …
test_infill.py           …
test_lora.py             …
test_rerank.py           …
test_security.py         …
test_slot_save.py        …
test_speculative.py      …
test_template.py         server: add --reasoning-budget 0 to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771) (2025-05-26 00:30:51 +01:00)
test_tokenize.py         …
test_tool_call.py        server: streaming of tool calls and thoughts when --jinja is on (#12379) (2025-05-25 01:48:08 +01:00)
test_vision_api.py       server : support audio input (#13714) (2025-05-23 11:03:47 +02:00)