Default Branch

bf5bcd0b85 · docs: update s390x documentation + add faq (#14389) · Updated 2025-06-26 10:41:41 +00:00

Branches

Each entry lists the branch head commit, its last update, and the commits behind/ahead of the default branch.

c257a8871c · cont : fix defrag erasing cells that didn't move · Updated 2025-06-09 17:45:56 +00:00 · 145 behind, 3 ahead

ca407742c5 · profiler: initial support for profiling graph ops · Updated 2025-06-05 21:38:13 +00:00 · 181 behind, 1 ahead

3862d954bb · rope · Updated 2025-06-01 19:46:15 +00:00 · 181 behind, 10 ahead

ac35e50c16 · Update tools/llama-bench/llama-bench.cpp · Updated 2025-05-31 22:38:37 +00:00 · 210 behind, 3 ahead

d3a2eb592d · disable on windows · Updated 2025-05-31 21:17:18 +00:00 · 201 behind, 12 ahead

9065ca71a2 · tests : sampling tests use min_keep == 0 · Updated 2025-05-27 08:30:41 +00:00 · 256 behind, 3 ahead

108d484ab2 · tts : fix n_ubatch + make WavTokenizer cache-less · Updated 2025-05-22 18:58:10 +00:00 · 300 behind, 1 ahead

b06a954bbc · llama_encode : only force non-causal attention for enc-dec models · Updated 2025-05-19 17:43:59 +00:00 · 333 behind, 1 ahead

8282d74692 · bench : handle decode errors · Updated 2025-05-14 19:36:29 +00:00 · 371 behind, 1 ahead

237acc7cd5 · server : update readme + return json for "meta" field · Updated 2025-05-14 12:30:12 +00:00 · 380 behind, 2 ahead

78d70223c3 · metal : use FA-vec kernel up to batch size 20 · Updated 2025-05-13 07:38:06 +00:00 · 397 behind, 3 ahead

5c32fc3d13 · Break down main function in llama-server · Updated 2025-05-10 12:31:48 +00:00 · 423 behind, 1 ahead

1cba73458b · small note about -hf --mmproj · Updated 2025-05-09 21:42:54 +00:00 · 426 behind, 2 ahead

6107303ab0 · llama : remove logits_all flag + reorder llama_context_params · Updated 2025-05-08 10:01:41 +00:00 · 450 behind, 2 ahead

8681d3ddb3 · Revert "fix build on windows" · Updated 2025-05-06 11:41:55 +00:00 · 469 behind, 3 ahead

16843dba33 · metal : pad mm results · Updated 2025-05-04 06:13:52 +00:00 · 486 behind, 1 ahead

15dea7bbdf · opt : remove print [no ci] · Updated 2025-05-02 18:25:29 +00:00 · 490 behind, 4 ahead

65202d2985 · sync : ggml · Updated 2025-05-01 06:59:02 +00:00 · 521 behind, 3 ahead

b710758323 · readme : update hot topics · Updated 2025-04-28 08:04:28 +00:00 · 555 behind, 1 ahead

37ae6a281a · Fixes Qwen2.5VL segfault during inference with https://github.com/ggml-org/llama.cpp/pull/12402 as has_qwen2vl_merger migration was incomplete · Updated 2025-04-27 10:36:57 +00:00 · 562 behind, 1 ahead