llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-06-29 12:35:16 +00:00

Author	SHA1	Message	Date
Francis Couture-Harpin	43cd2b3eb5	imatrix : support 3d tensors with MUL_MAT	2025-06-23 12:20:55 -04:00
Francis Couture-Harpin	1a9454a3d2	imatrix : avoid returning from void function save_imatrix	2025-06-18 16:44:41 -04:00
Francis Couture-Harpin	ba6f6be6ce	imatrix : don't use FMA explicitly. This should make comparisons between the formats easier because this matches the behavior of the previous version.	2025-06-18 16:33:37 -04:00
Francis Couture-Harpin	2c0945027a	Merge branch 'master' into compilade/imatrix-batched-chunks	2025-06-18 16:32:35 -04:00
Georgi Gerganov	745aa5319b	llama : deprecate llama_kv_self_ API (#14030) * llama : deprecate llama_kv_self_ API ggml-ci * llama : allow llama_memory_(nullptr) ggml-ci * memory : add flag for optional data clear in llama_memory_clear ggml-ci	2025-06-06 14:11:15 +03:00
Bartowski	efb8b47eda	imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389) * Add --parse-special for enabling parsing of special tokens in imatrix calculation * whitespace	2025-05-09 11:53:58 +02:00
Georgi Gerganov	51fb96b1ff	context : remove logits_all flag (#13284) * context : remove logits_all flag ggml-ci * llama : remove logits_all flag + reorder llama_context_params ggml-ci	2025-05-08 14:26:50 +03:00
Johannes Gäßler	3e959f0976	imatrix: fix oob writes if src1 is not contiguous (#13286)	2025-05-04 00:50:37 +02:00
Diego Devesa	1d36b3670b	llama : move end-user examples to tools directory (#13249) * llama : move end-user examples to tools directory --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-05-02 20:27:13 +02:00

9 Commits