Mirror of https://github.com/ggml-org/llama.cpp.git
Introduce C-style API (#370)
* Major refactoring - introduce C-style API
* Clean up
* Add <cassert>
* Add <iterator>
* Add <algorithm>
....
* Fix timing reporting and accumulation
* Measure eval time only for single-token calls
* Change llama_tokenize return meaning
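For context, here is a minimal sketch of how the new C-style entry points might be called. It assumes the llama.h interface this commit introduces (an opaque llama_context created from a model file, plus free functions such as llama_tokenize); the exact names, signatures, and the "number of tokens written, negative on failure" return convention are assumptions based on the commit message, not copied from the diff below.

// Hedged sketch of the C-style API introduced by this commit.
// Function names and signatures are assumptions, not taken from the hunk below.
#include <stdio.h>
#include "llama.h"   // assumed header added by this commit

int main(void) {
    struct llama_context_params params = llama_context_default_params(); // assumed helper
    struct llama_context * ctx = llama_init_from_file("models/7B/ggml-model-q4_0.bin", params);
    if (ctx == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // Per the commit message, the meaning of llama_tokenize's return value changed;
    // here we assume it returns the number of tokens written (negative on failure).
    llama_token tokens[64];
    const int n = llama_tokenize(ctx, "Hello, world", tokens, 64, /*add_bos=*/true);
    if (n < 0) {
        fprintf(stderr, "tokenization failed\n");
    } else {
        printf("tokenized into %d tokens\n", n);
    }

    llama_free(ctx);
    return 0;
}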
 ggml.h | 7 +++++++
 1 file changed, 7 insertions(+)
@@ -741,6 +741,13 @@ enum ggml_opt_result ggml_opt(
         struct ggml_opt_params params,
         struct ggml_tensor * f);
 
+//
+// quantization
+//
+
+size_t ggml_quantize_q4_0(float * src, void * dst, int n, int k, int qk, int64_t * hist);
+size_t ggml_quantize_q4_1(float * src, void * dst, int n, int k, int qk, int64_t * hist);
+
 //
 // system info
 //
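The two declarations added in this hunk take a float source buffer and write a packed 4-bit representation. The sketch below shows one plausible way to call ggml_quantize_q4_0, assuming n is the number of source floats, k is the row length, qk is the quantization block size, hist collects a 16-bin histogram of quantized values, and the return value is the number of bytes written to dst; none of these semantics are documented in the diff itself, and the output buffer here is a deliberate over-allocation rather than the exact packed size.

// Hedged usage sketch for the declarations added above; parameter meanings
// (n, k, qk, 16-bin hist) are assumptions, not documented in the hunk.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "ggml.h"   // provides the ggml_quantize_q4_0 declaration shown above

int main(void) {
    const int n  = 4096;   // number of floats to quantize (assumed meaning)
    const int k  = 4096;   // row length (assumed meaning)
    const int qk = 32;     // quantization block size (assumed default)

    float * src = malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        src[i] = (float) i / n;   // dummy data
    }

    // over-allocate: the packed q4_0 output is smaller than the float input
    void *  dst      = malloc(n * sizeof(float));
    int64_t hist[16] = {0};       // 4-bit values -> 16 histogram bins (assumption)

    const size_t written = ggml_quantize_q4_0(src, dst, n, k, qk, hist);
    printf("quantized %d floats into %zu bytes\n", n, written);

    free(src);
    free(dst);
    return 0;
}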