mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-07-22 02:38:03 +00:00
scripts: synthetic prompt mode for server-bench.py (#14695)
@@ -7,7 +7,7 @@ Set of LLM REST APIs and a simple web front end to interact with llama.cpp.
 **Features:**
 * LLM inference of F16 and quantized models on GPU and CPU
 * [OpenAI API](https://github.com/openai/openai-openapi) compatible chat completions and embeddings routes
-* Reranking endoint (https://github.com/ggml-org/llama.cpp/pull/9510)
+* Reranking endpoint (https://github.com/ggml-org/llama.cpp/pull/9510)
 * Parallel decoding with multi-user support
 * Continuous batching
 * Multimodal ([documentation](../../docs/multimodal.md)) / with OpenAI-compatible API support
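The feature list above mentions OpenAI-API-compatible chat completions routes. A minimal sketch of building such a request payload for a locally running server follows; the server URL, port, and model name are assumptions for illustration, not values taken from this commit:

```python
import json

# Hypothetical local server address; adjust to your own deployment.
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat completion payload.

    The field names follow the OpenAI chat completions schema
    (model, messages with role/content pairs).
    """
    return {
        "model": model,  # assumed placeholder name for a local model
        "messages": [
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
    }

payload = build_chat_request("Hello!")
body = json.dumps(payload)
# To send: POST `body` to SERVER_URL with the header
# Content-Type: application/json (e.g. via curl or urllib.request).
print(len(payload["messages"]))
```

This only constructs and serializes the payload; actually sending it requires a running server exposing the compatible route.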