llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-09-03 05:39:25 -04:00

Olivier Chafik e121edc432 server: add --reasoning-budget 0 to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)

---------

Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

2025-05-26 00:30:51 +01:00

test_basic.py

…

test_chat_completion.py

2025-05-25 01:48:08 +01:00

test_completion.py

2025-05-14 13:35:07 +02:00

test_ctx_shift.py

2025-05-16 21:50:00 +02:00

test_embedding.py

…

test_infill.py

…

test_lora.py

…

test_rerank.py

…

test_security.py

…

test_slot_save.py

…

test_speculative.py

…

test_template.py

2025-05-26 00:30:51 +01:00

test_tokenize.py

…

test_tool_call.py

2025-05-25 01:48:08 +01:00

test_vision_api.py

2025-05-23 11:03:47 +02:00