llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-14 04:17:53 -04:00

Pierrick Hymbert 9e359a4f47 server: continue to update other slots on embedding concurrent request (#5699)

* server: #5655 - continue to update other slots on embedding concurrent request.

* server: tests: add multi users embeddings as fixed

* server: tests: adding OAI compatible embedding concurrent endpoint

* server: tests: adding OAI compatible embedding with multiple inputs

2024-02-24 19:16:04 +01:00

steps.py

server: continue to update other slots on embedding concurrent request (#5699)

2024-02-24 19:16:04 +01:00