Commit Graph

8 Commits

Author SHA1 Message Date
4c9fdfbe15 ubatch : new splitting logic (#14217)
ggml-ci
2025-06-20 10:14:14 +03:00
d3e64b9f49 llama : rework embeddings logic (#14208)
* llama : rework embeddings logic

ggml-ci

* cont : fix rerank

ggml-ci

* cont : engrish [no ci]

* cont : fix rerank

ggml-ci

* server : support both embeddings and completions with single model

ggml-ci

* cont : avoid embeddings_org

ggml-ci
2025-06-16 14:14:00 +03:00
c3ee46fab4 batch : remove logits_all flag (#14141)
ggml-ci
2025-06-12 11:49:26 +03:00
7ae2932116 kv-cache : add LLAMA_KV_CACHE_DEBUG environment variable (#14121) 2025-06-11 12:52:45 +03:00
745aa5319b llama : deprecate llama_kv_self_ API (#14030)
* llama : deprecate llama_kv_self_ API

ggml-ci

* llama : allow llama_memory_(nullptr)

ggml-ci

* memory : add flag for optional data clear in llama_memory_clear

ggml-ci
2025-06-06 14:11:15 +03:00
7f37b6cf1e memory : migrate from llama_kv_cache to more generic llama_memory (#14006)
* memory : merge llama_kv_cache into llama_memory + new `llama_memory` API

ggml-ci

* context : fix casts

ggml-ci
2025-06-05 15:29:22 +03:00
3e63a58ef7 kv-cache : refactor the update/defrag mechanism (#13988)
* kv-cache : refactor update mechanism

ggml-ci

* memory : improve status handling

* defrag : reset head + add comments

ggml-ci

* cont : minor fixes

ggml-ci
2025-06-04 18:58:20 +03:00
0fc16b42e8 kv-cache : split implementation in separate sources (#13920)
ggml-ci
2025-06-01 11:39:27 +03:00