vocab : prevent tokenizer overflow (#14301)

* vocab : prevent stack overflow in tokenize
* vocab : return error instead of aborting on oversized token count
* vocab : INT32_MIN from llama_tokenize on overflow
2025-06-26 11:45:21 +00:00 · 2025-06-20 22:13:06 +08:00
parent 8308f98c7f
commit dd6e6d0b6a
3 changed files with 9 additions and 0 deletions
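
For callers, the return convention documented in this change has three cases: a non-negative token count, a negative count whose magnitude is the required buffer size, and INT32_MIN for overflow. A minimal sketch of decoding that value follows. The names `tok_status`, `tok_result`, and `interpret_tokenize_result` are hypothetical helpers for illustration and are not part of llama.h; only the meaning of the return value is taken from the header comments.

```c
#include <stdint.h>

/* Hypothetical status codes for illustration; not part of llama.h. */
typedef enum {
    TOK_OK,               /* ret >= 0: number of tokens written */
    TOK_BUFFER_TOO_SMALL, /* ret <  0: -ret tokens would have been returned */
    TOK_OVERFLOW          /* ret == INT32_MIN: result size exceeds int32_t */
} tok_status;

typedef struct {
    tok_status status;
    int32_t    count; /* tokens written or tokens required; 0 on overflow */
} tok_result;

/* Decode a value returned by llama_tokenize (hypothetical helper). */
static tok_result interpret_tokenize_result(int32_t ret) {
    tok_result r;
    if (ret == INT32_MIN) {
        /* Test this case first: -INT32_MIN is not representable in int32_t,
         * so negating it to obtain a "required size" would be undefined. */
        r.status = TOK_OVERFLOW;
        r.count  = 0;
    } else if (ret < 0) {
        r.status = TOK_BUFFER_TOO_SMALL;
        r.count  = -ret; /* safe: ret is in [-INT32_MAX, -1] here */
    } else {
        r.status = TOK_OK;
        r.count  = ret;
    }
    return r;
}
```

A caller that previously grew its buffer to `-ret` and retried on any negative result would want the INT32_MIN check ahead of that negation, since the overflow case cannot be fixed by resizing.
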
--- a/include/llama.h
+++ b/include/llama.h
@@ -1088,6 +1088,7 @@ extern "C" {
    /// @param tokens The tokens pointer must be large enough to hold the resulting tokens.
    /// @return Returns the number of tokens on success, no more than n_tokens_max
    /// @return Returns a negative number on failure - the number of tokens that would have been returned
+    /// @return Returns INT32_MIN on overflow (e.g., tokenization result size exceeds int32_t limit)
    /// @param add_special Allow to add BOS and EOS tokens if model is configured to do so.
    /// @param parse_special Allow tokenizing special and/or control tokens which otherwise are not exposed and treated
    ///                      as plaintext. Does not insert a leading space.