llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-19 00:57:41 +00:00

Files

Nigel Bosch eb7cf15a80 server : add /apply-template endpoint for additional use cases of Minja functionality (#11489)

* add /apply-template endpoint to server

* remove unnecessary line

* add /apply-template documentation

* return only "prompt" field in /apply-template

* use suggested idea instead of my overly verbose way

2025-01-29 19:45:44 +01:00

test_basic.py

server : add flag to disable the web-ui (#10762) (#10751)

2024-12-10 18:22:34 +01:00

test_chat_completion.py

server : add /apply-template endpoint for additional use cases of Minja functionality (#11489)

2025-01-29 19:45:44 +01:00

test_completion.py

server : Fixed wrong function name in llamacpp server unit test (#11473)

2025-01-29 00:03:42 +01:00

test_ctx_shift.py

…

test_embedding.py

server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)

2024-12-24 21:33:04 +01:00

test_infill.py

server : fix extra BOS in infill endpoint (#11106)

2025-01-06 15:36:08 +02:00

test_lora.py

server : allow using LoRA adapters per-request (#10994)

2025-01-02 15:05:18 +01:00

test_rerank.py

server : fill usage info in embeddings and rerank responses (#10852)

2024-12-17 18:00:24 +02:00

test_security.py

…

test_slot_save.py

…

test_speculative.py

server : allow using LoRA adapters per-request (#10994)

2025-01-02 15:05:18 +01:00

test_tokenize.py

…