llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-15 20:53:00 -04:00

Files

Georgi Gerganov 0e70ba686e server : add "tokens" output (#10853)

* server : add "tokens" output

ggml-ci

* server : update readme

ggml-ci

* server : return tokens ids only if requested

ggml-ci

* tests : improve "tokens" type check

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : remove "tokens" from the OAI endpoint

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

2024-12-18 11:05:29 +02:00

test_basic.py

server : add flag to disable the web-ui (#10762) (#10751)

2024-12-10 18:22:34 +01:00

test_chat_completion.py

server : (refactor) no more json in server_task input (#10691)

2024-12-07 20:21:09 +01:00

test_completion.py

server : add "tokens" output (#10853)

2024-12-18 11:05:29 +02:00

test_ctx_shift.py

…