llama.cpp/tools at 7f323a589f8684c0eb722e7309074cb5eac0c8b5 - llama.cpp - Cat's Mantra

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-28 13:20:27 -04:00

David Huang 7f323a589f Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (#13386)

2025-05-11 14:18:39 +02:00

…

cvector-generator

…

…

…

imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)

2025-05-09 11:53:58 +02:00

Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (#13386)

2025-05-11 14:18:39 +02:00

llama : do not crash if there is no CPU backend (#13395)

2025-05-09 13:02:07 +02:00

Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (#13386)

2025-05-11 14:18:39 +02:00

context : remove logits_all flag (#13284)

2025-05-08 14:26:50 +03:00

…

llama : do not crash if there is no CPU backend (#13395)

2025-05-09 13:02:07 +02:00

llama-run: add support for downloading models from ModelScope (#13370)

2025-05-09 10:25:50 +01:00

server : update docs (#13432)

2025-05-10 18:44:49 +02:00

…

…

CMakeLists.txt

mtmd : rename llava directory to mtmd (#13311)

2025-05-05 16:02:55 +02:00