llama.cpp

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-06-28 04:15:21 +00:00

Commit Graph

Author SHA1 Message Date


Olivier Chafik

e121edc432

`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)

---------

Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

2025-05-26 00:30:51 +01:00

Olivier Chafik

d785f9c1fd

server: fix/test add_generation_prompt (#13770)

Co-authored-by: ochafik <ochafik@google.com>

2025-05-25 10:45:49 +01:00

Olivier Chafik

aa48e373f2

`server`: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802)

* Inject date_string in llama 3.x + fix for functionary v2

https://github.com/ggml-org/llama.cpp/issues/12729

* move/fix detection of functionary v3.1 before llama 3.x, fix & test their non-tool mode

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* generate more tokens in test_completion_with_required_tool_tiny_fast to avoid truncation

---------

Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

2025-05-15 02:39:51 +01:00

3 Commits