server: init functional tests (#5566)

* server: tests: init scenarios
 - health and slots endpoints
 - completion endpoint
 - OAI-compatible chat completion requests, with and without streaming
 - multi-user completion scenario
 - multi-user scenario on the OAI-compatible endpoint with streaming
 - multi-user scenario where the total number of tokens to predict exceeds the KV cache size
 - incorrect server usage scenario, as in issue #3969 (Infinite loop of "context shift")
 - slots shifting
 - continuous batching
 - embeddings endpoint
 - multi-user embeddings endpoint, as in issue #5655 (Segmentation fault)
 - OpenAI-compatible embeddings API
 - tokenize endpoint
 - CORS and API key scenario
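
To illustrate what the streaming scenarios in the list above have to check, here is a minimal sketch of reassembling a streamed OAI-style chat completion. It is not taken from the test suite in this commit. The helper name `collect_streamed_content` is hypothetical, and the sketch assumes the server emits server-sent-event lines of the form `data: {json}` and ends the stream with `data: [DONE]`, as the OpenAI API does.

```python
import json


def collect_streamed_content(lines):
    """Join the `delta.content` pieces from OAI-style SSE lines.

    Assumption: each event is a single line starting with "data: " and
    the stream ends with "data: [DONE]". Other lines are skipped.
    """
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # blank keep-alive lines, comments, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

A test step could feed the decoded response lines to this helper and compare the joined text, or its length, against the scenario's expectations.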

* server: CI GitHub workflow


---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Pierrick Hymbert

2024-02-24 12:28:55 +01:00

committed by

GitHub

parent fd43d66f46

commit 525213d2f5

14 changed files with 1243 additions and 18 deletions

examples/server/tests/requirements.txt Normal file

@@ -0,0 +1,3 @@
 aiohttp~=3.9.3
 behave~=1.2.6
 openai~=0.25.0
