Default Branch

8846aace49 · model : gemma3n text-only (#14400) · Updated 2025-06-26 17:34:02 +00:00

Branches

8b5ea7ad67 · SYCL: Take improvements from GLU branch and disable faulty fp16 exp after update · Updated 2025-06-26 14:18:58 +00:00 · 2 behind, 1 ahead

6179578988 · batch : require non-coupled batch with sequential split_equal · Updated 2025-06-25 14:20:46 +00:00 · 11 behind, 25 ahead

2aac8e81b0 · add tests · Updated 2025-06-24 12:02:34 +00:00 · 3 behind, 3 ahead

e33de128c7 · common : move string_remove_suffix from quantize and imatrix · Updated 2025-06-23 20:24:06 +00:00 · 8 behind, 27 ahead

afdb669206 · Merge branch 'master' into compilade/mamba2 · Updated 2025-06-23 14:40:16 +00:00 · 8 behind, 41 ahead

36f8e20d08 · kv-cache : utilize ggml_set_rows broadcast · Updated 2025-06-23 10:22:51 +00:00 · 11 behind, 18 ahead

ab46d11de5 · Refactor: Optimize SYCL element-wise operations with unary function inlining · Updated 2025-06-22 13:51:19 +00:00 · 20 behind, 19 ahead

ae96333923 · metal : fix thread-safety · Updated 2025-06-20 13:42:54 +00:00 · 37 behind, 1 ahead

6fb2f2e8a9 · ggml : fix repack work size for mul_mat_id · Updated 2025-06-20 07:34:16 +00:00 · 40 behind, 1 ahead

6201b43814 · Update the graph. · Updated 2025-06-19 15:13:28 +00:00 · 67 behind, 4 ahead

ccb2bb9988 · test-model-random : show max error · Updated 2025-06-18 19:11:23 +00:00 · 67 behind, 9 ahead

59fee24c72 · recurrent : rework graph inputs + add TODOs · Updated 2025-06-18 06:29:51 +00:00 · 64 behind, 31 ahead

d3d06debe3 · server : add pidfile option · Updated 2025-06-17 20:47:53 +00:00 · 65 behind, 1 ahead

4b2233befb · Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer · Updated 2025-06-17 20:25:42 +00:00 · 67 behind, 1 ahead

36fce98281 · server : re-enable swa speculative decoding · Updated 2025-06-12 08:51:15 +00:00 · 108 behind, 1 ahead

ed99a8ea04 · cont : fix comments · Updated 2025-06-12 07:43:55 +00:00 · 111 behind, 3 ahead

4b6fb6524b · context : round n_tokens to next multiple of n_seqs when reserving · Updated 2025-06-11 20:19:17 +00:00 · 114 behind, 1 ahead

62a9f34bae · llama-graph : fix recurrent state copy · Updated 2025-06-10 04:26:30 +00:00 · 138 behind, 3 ahead

c257a8871c · cont : fix defrag erasing cells that didn't move · Updated 2025-06-09 17:45:56 +00:00 · 145 behind, 3 ahead

ca407742c5 · profiler: initial support for profiling graph ops · Updated 2025-06-05 21:38:13 +00:00 · 181 behind, 1 ahead