Georgi Gerganov
29ae62d2ae
llama : fix embeddings (#5796)
* llama : fix embeddings
ggml-ci
* llama : do not use KV cache for non-causal models
ggml-ci
* embeddings : fix llama_batch_init arg
* llama : add pooling switch
* llama : distinguish token vs sequence embeddings
ggml-ci
* llama : assert pooling tensor
* llama : simplify causal mask condition
ggml-ci
* llama : assert input batch with pooling enabled
* readme : update API changes list
2024-03-04 22:31:20 +02:00
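
The change list above touches the embeddings API surface: a pooling switch, the distinction between token-level and sequence-level embeddings, and the `llama_batch_init` argument fixed in the embeddings example. Below is a minimal sketch of how that API is driven, assuming the `llama.h` names of this era (`llama_get_embeddings_seq` lands in this very PR; other names such as `cparams.embeddings`, `cparams.pooling_type`, and the `llama_tokenize` signature should be verified against the header of the checkout you build):

```cpp
// Sketch (not code from the commit): fetch a pooled sequence embedding.
#include "llama.h"

#include <cstdio>
#include <cstring>
#include <vector>

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <model.gguf>\n", argv[0]);
        return 1;
    }

    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file(argv[1], mparams);

    llama_context_params cparams = llama_context_default_params();
    cparams.embeddings   = true;                    // extract embeddings
    cparams.pooling_type = LLAMA_POOLING_TYPE_MEAN; // sequence-level pooling

    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // tokenize a short prompt
    const char * text = "hello world";
    std::vector<llama_token> tokens(32);
    const int n_tok = llama_tokenize(model, text, (int) strlen(text),
                                     tokens.data(), (int) tokens.size(),
                                     /*add_special*/ true, /*parse_special*/ false);

    // second arg is 0 => the batch carries token IDs, not raw embeddings
    // (this is the llama_batch_init argument the commit fixes)
    llama_batch batch = llama_batch_init(n_tok, 0, 1);
    for (int i = 0; i < n_tok; ++i) {
        batch.token   [batch.n_tokens]    = tokens[i];
        batch.pos     [batch.n_tokens]    = i;
        batch.n_seq_id[batch.n_tokens]    = 1;
        batch.seq_id  [batch.n_tokens][0] = 0;
        batch.logits  [batch.n_tokens]    = true; // request output for this token
        batch.n_tokens++;
    }

    if (llama_decode(ctx, batch) != 0) {
        fprintf(stderr, "llama_decode failed\n");
        return 1;
    }

    // sequence embedding, pooled over the whole sequence; contrast with
    // llama_get_embeddings_ith(ctx, i), which returns one token's embedding
    const float * emb = llama_get_embeddings_seq(ctx, 0);
    const int n_embd  = llama_n_embd(model);
    printf("dim = %d, embd[0..2] = %f %f %f\n", n_embd, emb[0], emb[1], emb[2]);

    llama_batch_free(batch);
    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

Per the commit notes, non-causal (embedding) models no longer go through the KV cache, so the pooled result comes straight from the batch just decoded rather than from cached state.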