tqcq/llama.cpp
Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-08-18 05:56:00 -04:00)
Commit: 37ae6a281abda448b9fccb0b27b09e6f5d444ae5
Path: llama.cpp/examples/server/tests/unit
Latest commit: Georgi Gerganov, a19b5cef16, llama : fix FA when KV cache is not used (i.e. embeddings) (#12825), 2025-04-08 19:54:51 +03:00

* ggml : FA supports F32 V
* graph : cast KV to F16 when the KV cache is not used
* server : add test that exercises embeddings with FA enabled
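The third bullet refers to the new test in test_embedding.py. As a rough illustration, here is a minimal sketch of such a test in the style of the other suites in this directory; ServerPreset, make_request(), and the fa attribute are assumptions based on the suite's shared utils.py harness, not a verbatim copy of the committed change.

```python
# Hypothetical sketch of an embeddings-with-Flash-Attention server test.
# ServerPreset, make_request(), and the fa attribute are assumed to come
# from the suite's utils.py helper; names may differ from the real test.
import pytest
from utils import *

server = ServerPreset.bert_bge_small()


@pytest.fixture(autouse=True)
def create_server():
    global server
    server = ServerPreset.bert_bge_small()
    server.fa = True  # assumption: enable Flash Attention on the test server


def test_embedding_with_fa():
    global server
    server.start()
    # embeddings do not use the KV cache, so with FA enabled this request
    # exercises the F32-V / F16-cast path that the commit fixes
    res = server.make_request("POST", "/v1/embeddings", data={
        "input": "I believe the meaning of life is",
    })
    assert res.status_code == 200
    assert len(res.body["data"][0]["embedding"]) > 1
```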
File                     Last commit                                                                        Date
test_basic.py            …
test_chat_completion.py  server: fix deadly typo in response_format.json_schema.schema handling (#12168)   2025-03-04 08:24:07 +02:00
test_completion.py       server : Fixed wrong function name in llamacpp server unit test (#11473)           2025-01-29 00:03:42 +01:00
test_ctx_shift.py        …
test_embedding.py        llama : fix FA when KV cache is not used (i.e. embeddings) (#12825)                2025-04-08 19:54:51 +03:00
test_infill.py           server : fix extra BOS in infill endpoint (#11106)                                 2025-01-06 15:36:08 +02:00
test_lora.py             …
test_rerank.py           server : add TEI API format for /rerank endpoint (#11942)                          2025-02-18 14:21:41 +01:00
test_security.py         …
test_slot_save.py        …
test_speculative.py      …
test_tokenize.py         …
test_tool_call.py        tool-call: ensure there's always a non-empty tool call id (#12292)                 2025-03-10 09:45:29 +00:00
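For context on the test_rerank.py entry above: #11942 added support for the TEI (Text Embeddings Inference) request/response shape on the /rerank endpoint. Below is a minimal sketch of such a request, assuming a llama-server instance with a reranking model is already listening locally; the host, port, and example strings are illustrative.

```python
# Hypothetical sketch: calling /rerank with the TEI-style request shape.
# Assumes a llama-server with a reranking model on localhost:8080;
# host/port and the example texts are illustrative, not from the repo.
import requests

res = requests.post("http://localhost:8080/rerank", json={
    "query": "What is a panda?",
    "texts": [  # TEI-style field (the native API uses "documents")
        "hi",
        "The giant panda is a bear species endemic to China.",
    ],
})
res.raise_for_status()
# TEI-style response: a list of {"index": ..., "score": ...} objects
for item in res.json():
    print(item["index"], item["score"])
```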