mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-08-14 12:19:48 -04:00
server : add support for embd_normalize parameter (#14964)
This commit adds support for the `embd_normalize` parameter in the server code.

The motivation for this is that if the server is started with a pooling type other than `none`, Euclidean/L2 normalization is currently the only normalization method applied to embeddings. This is not always the desired behavior; users may want a different normalization (or none at all), and this commit allows that.

Example usage:
```console
curl --request POST \
  --url http://localhost:8080/embedding \
  --header "Content-Type: application/json" \
  --data '{"input": "Hello world today", "embd_normalize": -1}'
```
@@ -644,6 +644,15 @@ The same as [the embedding example](../embedding) does.
`image_data`: An array of objects holding base64-encoded image `data` and their `id`s to be referenced in `content`. You can determine the place of the image in the content as in the following: `Image: [img-21].\nCaption: This is a picture of a house`. In this case, `[img-21]` will be replaced by the embeddings of the image with id `21` in the following `image_data` array: `{..., "image_data": [{"data": "<BASE64_STRING>", "id": 21}]}`. Use `image_data` only with multimodal models, e.g., LLaVA.
`embd_normalize`: Normalization applied to pooled embeddings. Can be one of the following values:

```
-1: No normalization
 0: Max absolute
 1: Taxicab
 2: Euclidean/L2
>2: P-Norm
```
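The value-to-norm mapping above can be sketched in Python; `normalize_embedding` is an illustrative helper written for this sketch, not a function from llama.cpp:

```python
import math

def normalize_embedding(embd, embd_normalize=2):
    """Sketch of the embd_normalize value mapping described above."""
    if embd_normalize < 0:
        return list(embd)  # -1: no normalization
    if embd_normalize == 0:
        norm = max(abs(x) for x in embd)  # 0: max absolute
    elif embd_normalize == 1:
        norm = sum(abs(x) for x in embd)  # 1: taxicab (L1)
    elif embd_normalize == 2:
        norm = math.sqrt(sum(x * x for x in embd))  # 2: Euclidean (L2)
    else:
        p = embd_normalize
        norm = sum(abs(x) ** p for x in embd) ** (1.0 / p)  # >2: p-norm
    return [x / norm for x in embd] if norm > 0 else list(embd)
```

For example, with the default L2 normalization the vector `[3, 4]` becomes `[0.6, 0.8]`, while `embd_normalize: -1` returns it unchanged.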
### POST `/reranking`: Rerank documents according to a given query
Similar to https://jina.ai/reranker/ but might change in the future.