llama.cpp/unit at a457551332853ef19d0796fec12b62c538126ea5

tqcq/llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-07-19 00:57:41 +00:00

Sigbjørn Skjæret ddef99522d server : fix assistant prefilling when content is an array (#14360)

2025-07-05 09:17:14 +02:00

test_basic.py

…

test_chat_completion.py

server : fix assistant prefilling when content is an array (#14360)

2025-07-05 09:17:14 +02:00

test_completion.py

server: fix regression on streamed non-chat completion w/ stops (#13785)

2025-05-26 14:16:37 +01:00

test_ctx_shift.py

server : do not return error out of context (with ctx shift disabled) (#13577)

2025-05-16 21:50:00 +02:00

test_embedding.py

…

test_infill.py

…

test_lora.py

…

test_rerank.py

…

test_security.py

…

test_slot_save.py

…

test_speculative.py

…

test_template.py

server: add --reasoning-budget 0 to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)

2025-05-26 00:30:51 +01:00

test_tokenize.py

…

test_tool_call.py

server: update deepseek reasoning format (pass reasoning_content as diffs) (#13933)

2025-06-02 10:15:44 -07:00

test_vision_api.py

server : support audio input (#13714)

2025-05-23 11:03:47 +02:00