mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-06-27 03:55:20 +00:00
Add more tokenizer tests (#3742)
* Add more tokenizer tests * Add starcoder * Update test vocab files * Restrict bpe tokenizer tests to unicode planes * Update comment * Comment cosmetics * Remove bloom vocab/test
This commit is contained in:
BIN
models/ggml-vocab-baichuan.gguf
Normal file
BIN
models/ggml-vocab-baichuan.gguf
Normal file
Binary file not shown.
BIN
models/ggml-vocab-gpt-neox.gguf
Normal file
BIN
models/ggml-vocab-gpt-neox.gguf
Normal file
Binary file not shown.
BIN
models/ggml-vocab-refact.gguf
Normal file
BIN
models/ggml-vocab-refact.gguf
Normal file
Binary file not shown.
BIN
models/ggml-vocab-starcoder.gguf
Normal file
BIN
models/ggml-vocab-starcoder.gguf
Normal file
Binary file not shown.
Reference in New Issue
Block a user