43cd2b3eb5
imatrix : support 3d tensors with MUL_MAT
2025-06-23 12:20:55 -04:00
1a9454a3d2
imatrix : avoid returning from void function save_imatrix
2025-06-18 16:44:41 -04:00
ba6f6be6ce
imatrix : don't use FMA explicitly
...
This should make comparisons between the formats easier
because this matches the behavior of the previous version.
2025-06-18 16:33:37 -04:00
2c0945027a
Merge branch 'master' into compilade/imatrix-batched-chunks
2025-06-18 16:32:35 -04:00
745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
...
* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci
2025-06-06 14:11:15 +03:00
efb8b47eda
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
...
* Add --parse-special for enabling parsing of special tokens in imatrix calculation
* whitespace
2025-05-09 11:53:58 +02:00
51fb96b1ff
context : remove logits_all flag (#13284)
...
* context : remove logits_all flag
ggml-ci
* llama : remove logits_all flag + reorder llama_context_params
ggml-ci
2025-05-08 14:26:50 +03:00
3e959f0976
imatrix : fix oob writes if src1 is not contiguous (#13286)
2025-05-04 00:50:37 +02:00
1d36b3670b
llama : move end-user examples to tools directory (#13249)
...
* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00