Default Branch

bf5bcd0b85 · docs: update s390x documentation + add faq (#14389) · Updated 2025-06-26 10:41:41 +00:00

Branches

ec65d54b65 · only return mistral-v7-tekken as default template · Updated 2025-06-26 10:16:22 +00:00

1 behind · 1 ahead

291f1e2810 · metal : add special-case mat-vec mul for ne00 == 4 · Updated 2025-06-26 09:12:51 +00:00

1 behind · 1 ahead

7b7ecc0109 · metal : handle some edge cases when threadgroup size is not a power of 2 · Updated 2025-06-26 07:20:45 +00:00

1 behind · 2 ahead

6179578988 · batch : require non-coupled batch with sequential split_equal · Updated 2025-06-25 14:20:46 +00:00

11 behind · 25 ahead

2aac8e81b0 · add tests · Updated 2025-06-24 12:02:34 +00:00

3 behind · 3 ahead

e33de128c7 · common : move string_remove_suffix from quantize and imatrix · Updated 2025-06-23 20:24:06 +00:00

8 behind · 27 ahead

afdb669206 · Merge branch 'master' into compilade/mamba2 · Updated 2025-06-23 14:40:16 +00:00

8 behind · 41 ahead

36f8e20d08 · kv-cache : utilize ggml_set_rows broadcast · Updated 2025-06-23 10:22:51 +00:00

11 behind · 18 ahead

ab46d11de5 · Refactor: Optimize SYCL element-wise operations with unary function inlining · Updated 2025-06-22 13:51:19 +00:00

20 behind · 19 ahead

ae96333923 · metal : fix thread-safety · Updated 2025-06-20 13:42:54 +00:00

37 behind · 1 ahead

6fb2f2e8a9 · ggml : fix repack work size for mul_mat_id · Updated 2025-06-20 07:34:16 +00:00

40 behind · 1 ahead

6201b43814 · Update the graph. · Updated 2025-06-19 15:13:28 +00:00

67 behind · 4 ahead

ccb2bb9988 · test-model-random : show max error · Updated 2025-06-18 19:11:23 +00:00

67 behind · 9 ahead

59fee24c72 · recurrent : rework graph inputs + add TODOs · Updated 2025-06-18 06:29:51 +00:00

64 behind · 31 ahead

d3d06debe3 · server : add pidfile option · Updated 2025-06-17 20:47:53 +00:00

65 behind · 1 ahead

4b2233befb · Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer · Updated 2025-06-17 20:25:42 +00:00

67 behind · 1 ahead

36fce98281 · server : re-enable swa speculative decoding · Updated 2025-06-12 08:51:15 +00:00

108 behind · 1 ahead

ed99a8ea04 · cont : fix comments · Updated 2025-06-12 07:43:55 +00:00

111 behind · 3 ahead

4b6fb6524b · context : round n_tokens to next multiple of n_seqs when reserving · Updated 2025-06-11 20:19:17 +00:00

114 behind · 1 ahead

62a9f34bae · llama-graph : fix recurrent state copy · Updated 2025-06-10 04:26:30 +00:00

138 behind · 3 ahead