tqcq/llama.cpp
mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-22 10:48:12 +00:00
cde383323959544abe10a4d79e1d3e1ee479933c
llama.cpp/examples/server/tests/unit
Olivier Chafik cde3833239 tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616)
* tool-call: allow `--jinja --chat-template chatml`

* fix double bos issue (drop bos/eos tokens from jinja template)

* add missing try catch around jinja parsing to default to chatml

* Simplify default chatml logic
2025-02-03 23:49:27 +00:00
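The commit above lets the server combine `--chat-template chatml` with `--jinja`, and falls back to chatml when Jinja template parsing fails. A minimal sketch of a launch command using those flags; the binary path and model file below are hypothetical placeholders, only the two flags come from the commit message.

```python
# Sketch of a llama.cpp server launch using the flags named in the commit.
# "./llama-server" and "model.gguf" are hypothetical placeholders.
import subprocess

cmd = [
    "./llama-server",             # hypothetical path to the server binary
    "-m", "model.gguf",           # hypothetical model file
    "--jinja",                    # enable Jinja-based template rendering
    "--chat-template", "chatml",  # now allowed in combination with --jinja
]
# subprocess.run(cmd)  # uncomment to actually start the server
print(cmd)
```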
test_basic.py
…
test_chat_completion.py
tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616)
2025-02-03 23:49:27 +00:00
test_completion.py
server : Fixed wrong function name in llamacpp server unit test (#11473)
2025-01-29 00:03:42 +01:00
test_ctx_shift.py
…
test_embedding.py
server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2024-12-24 21:33:04 +01:00
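With `"encoding_format": "base64"`, an embeddings endpoint returns each vector as a base64 string instead of a JSON array of numbers. A minimal decoding sketch; the packed little-endian float32 layout is an assumption based on the common OpenAI-style convention, not something stated in the commit message.

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    """Decode a base64 embedding, assuming packed little-endian float32 values."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip check with a tiny synthetic vector.
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode()
print(decode_base64_embedding(encoded))  # → [0.5, -1.0, 2.0]
```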
test_infill.py
server : fix extra BOS in infill endpoint (#11106)
2025-01-06 15:36:08 +02:00
test_lora.py
server : allow using LoRA adapters per-request (#10994)
2025-01-02 15:05:18 +01:00
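Per-request LoRA selection means a completion request can carry its own adapter list instead of relying on server-wide settings. A sketch of what such a request body might look like; the `lora` field with `id`/`scale` entries is an assumption inferred from the feature description, so check the server README for the exact schema.

```python
import json

# Hypothetical request body; the "lora" field shape (id + scale) is an
# assumption, not confirmed by the commit message.
payload = {
    "prompt": "Hello",
    "n_predict": 16,
    "lora": [
        {"id": 0, "scale": 0.5},  # adapter index 0 applied at half strength
    ],
}
print(json.dumps(payload))
```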
test_rerank.py
…
test_security.py
…
test_slot_save.py
…
test_speculative.py
server : allow using LoRA adapters per-request (#10994)
2025-01-02 15:05:18 +01:00
test_tokenize.py
…
test_tool_call.py
tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616)
2025-02-03 23:49:27 +00:00
Powered by Gitea Version: 1.24.2