llama : Support llama 4 text-only (#12791)

* llama4 conversion
* initial support, no chat template
* clean up a bit
* fix tokenizer conversion
* correct hparams
* try this
* fix shexp
* ffn_inp_normed
* chat template
* clean up model conversion
* add_bos
* add scale_before_ffn
* fix order
* weight_before_ffn
* llm_graph_input_attn_temp
* add chunk attn mask
* build_inp_attn_scale()
* add comment about ggml_repeat
* clarify comments
* fix build
2025-08-11 19:11:32 -04:00 · 2025-04-07 23:06:44 +02:00
parent 82974011f3
commit 1466621e73
17 changed files with 532 additions and 22 deletions
--- a/include/llama.h
+++ b/include/llama.h
@@ -110,6 +110,7 @@ extern "C" {
        LLAMA_VOCAB_PRE_TYPE_SUPERBPE       = 30,
        LLAMA_VOCAB_PRE_TYPE_TRILLION       = 31,
        LLAMA_VOCAB_PRE_TYPE_BAILINGMOE     = 32,
+        LLAMA_VOCAB_PRE_TYPE_LLAMA4         = 33,
    };

    enum llama_rope_type {