Default Branch

5edf1592fd · vulkan : fix out-of-bounds access in argmax kernel (#15342) · Updated 2025-08-15 10:16:36 -04:00

Branches

497474b135 · sched : copy only the used experts when offloading prompt processing · Updated 2025-08-15 09:57:51 -04:00

4
1

395457232b · vulkan : fix out-of-bounds access in argmax kernel · Updated 2025-08-15 05:34:56 -04:00

5
1

afa74511cd · vulkan : fix compile warnings on macos · Updated 2025-08-15 04:35:20 -04:00

6
1

1ae6ab7601 · Merge branch 'master' into compilade/convert-prequant · Updated 2025-08-14 17:05:21 -04:00

107
1

220860aa0c · graph : use F32 accumulators for gpt-oss · Updated 2025-08-14 09:08:31 -04:00

10
1

dbf0433f47 · typo-- · Updated 2025-08-13 09:34:50 -04:00

24
16

d9b625edb6 · ggml-quants : handle imatrix for MXFP4 · Updated 2025-08-11 22:12:10 -04:00

36
1

94d8042eb8 · fix · Updated 2025-08-11 09:25:33 -04:00

39
2

2763dc8b53 · ggml-quants : handle zero amax for MXFP4 · Updated 2025-08-06 16:26:25 -04:00

18
2

ea5e55d03e · Merge branch 'master' into compilade/imatrix-neutral-prior · Updated 2025-08-05 13:34:40 -04:00

20
4

2ec70c964b · tests: Fix OPT_STEP_SGD test-backend-ops · Updated 2025-08-05 00:57:14 -04:00

90
3

145401c9e3 · context : fix logits size overflow for huge batches · Updated 2025-08-04 22:26:46 -04:00

25
2

342e7014db · imatrix : only warn about suffix when output format is unspecified · Updated 2025-08-04 15:12:27 -04:00

30
2

32585e7c98 · vulkan: tune mul_mat_vecq performance for Intel · Updated 2025-08-03 09:49:56 -04:00

51
1

e549515cb3 · memory : handle kv_unified for hybrid models · Updated 2025-08-03 00:45:47 -04:00

39
1

91e67b8583 · imatrix : fix 3d tensor counts · Updated 2025-07-31 11:56:38 -04:00

7
4

b98f80a6b4 · server : test alternative LRU logic · Updated 2025-07-29 14:19:21 -04:00

28
1

0591b39e48 · ops: add MUSA · Updated 2025-07-29 05:25:32 -04:00

34
1

381879e0ac · cont : tmp · Updated 2025-07-29 00:42:55 -04:00

58
3

fb371c18ec · bench,common : add CPU extra buffer types · Updated 2025-07-28 14:53:18 -04:00

35
1