ggml : introduce structs for the q4 data blocks (#356)

* Introduce structs for the q4 data blocks
* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-08-29 11:39:14 -04:00 · 2023-03-28 15:56:03 +00:00
parent e0670260fb
commit c1f885067c
6 changed files with 150 additions and 235 deletions
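For readers who have only this excerpt, here is a rough sketch of what a struct for a 4-bit quantized data block could look like. It is not taken from this commit. The names `block_q4_0_sketch`, `quantize_block_sketch`, and `dequantize_block_sketch`, the block size `QK = 32`, and the "one float scale plus packed nibbles" layout are all assumptions made for illustration. The removal of the `qk` argument from `llama_model_quantize` in the diff below would be consistent with the block size being fixed by the struct definition, but the excerpt does not state that.

```c
#include <assert.h>
#include <stdint.h>

#define QK 32 /* assumed number of values per block (hypothetical) */

/* Hypothetical layout: one scale plus QK 4-bit quants, two per byte. */
typedef struct {
    float   d;          /* scale (delta) shared by the whole block */
    uint8_t qs[QK / 2]; /* packed nibbles: low = even index, high = odd index */
} block_q4_0_sketch;

/* Round to nearest integer, halves away from zero (avoids linking libm). */
static int round_nearest(float v) {
    return (int)(v + (v >= 0.0f ? 0.5f : -0.5f));
}

/* Quantize QK floats into one block: map [-amax, amax] onto [-7, 7]. */
static void quantize_block_sketch(const float *x, block_q4_0_sketch *out) {
    float amax = 0.0f; /* largest absolute value in the block */
    for (int i = 0; i < QK; i++) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    const float d  = amax / 7.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;

    out->d = d;
    for (int i = 0; i < QK; i += 2) {
        /* Shift by 8 so the stored nibble is unsigned (1..15). */
        const uint8_t q0 = (uint8_t)(round_nearest(x[i]     * id) + 8);
        const uint8_t q1 = (uint8_t)(round_nearest(x[i + 1] * id) + 8);
        out->qs[i / 2] = (uint8_t)(q0 | (q1 << 4));
    }
}

/* Reverse the mapping: unpack each nibble, remove the offset, rescale. */
static void dequantize_block_sketch(const block_q4_0_sketch *in, float *y) {
    for (int i = 0; i < QK; i += 2) {
        const uint8_t b = in->qs[i / 2];
        y[i]     = (float)((b & 0x0F) - 8) * in->d;
        y[i + 1] = (float)((b >> 4)   - 8) * in->d;
    }
}
```

With a typed block like this, code can index an array of blocks directly instead of computing byte offsets into a raw buffer by hand, which is the kind of cleanup the commit title suggests.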
--- a/llama.h
+++ b/llama.h
@@ -81,8 +81,7 @@ extern "C" {
    LLAMA_API int llama_model_quantize(
            const char * fname_inp,
            const char * fname_out,
-                   int   itype,
-                   int   qk);
+                   int   itype);

    // Run the llama inference to obtain the logits and probabilities for the next token.
    // tokens + n_tokens is the provided batch of new tokens to process