llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-09-06 15:18:01 -04:00

Pierrick Hymbert 9e359a4f47 server: continue to update other slots on embedding concurrent request (#5699)

* server: #5655 - continue to update other slots on embedding concurrent request.

* server: tests: add multi users embeddings as fixed

* server: tests: adding OAI compatible embedding concurrent endpoint

* server: tests: adding OAI compatible embedding with multiple inputs

2024-02-24 19:16:04 +01:00

steps

server: continue to update other slots on embedding concurrent request (#5699)

2024-02-24 19:16:04 +01:00

environment.py

server: init functional tests (#5566)

2024-02-24 12:28:55 +01:00

issues.feature

server: continue to update other slots on embedding concurrent request (#5699)

2024-02-24 19:16:04 +01:00

parallel.feature

server: continue to update other slots on embedding concurrent request (#5699)