mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-07-04 18:16:58 +00:00
tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)
* tests : add test-tokenizer-0.sh
* unicode : add all unicode number ranges
* starcoder : fix pre-tokenizer
* tests : add test that fails with DeepSeek tokenizers
* falcon : fix regex
* unicode : regenerate unicode tables
* refact : add tokenizer model
* lint : fix
* tests : disable failing tests

ggml-ci

* refact : add tests files

ggml-ci

* convert : print -> logging

ggml-ci

* lint : fix
* unicode : digit -> number
* phi-3 : update
@@ -1,3 +1,5 @@
1050 207 19 207 19192 4217
37 32009 71 6247

207
243
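The hunk above appears to hold space-separated token ids, one tokenized test string per line, with an empty line for an empty result. A minimal sketch, assuming that format, of how such an expected-output file could be compared against a tokenizer's actual output. The function names `parse_token_lines` and `first_mismatch` are hypothetical and are not part of llama.cpp or of test-tokenizer-0.sh.

```python
def parse_token_lines(text):
    """Parse one list of integer token ids per line; an empty line yields []."""
    return [[int(tok) for tok in line.split()] for line in text.splitlines()]


def first_mismatch(expected, actual):
    """Return the index of the first differing line, or None if both match."""
    for i, (exp, act) in enumerate(zip(expected, actual)):
        if exp != act:
            return i
    if len(expected) != len(actual):
        # One side has extra lines; report the first line without a partner.
        return min(len(expected), len(actual))
    return None


# Token ids copied from the hunk above (assumed to be the expected output).
expected = parse_token_lines(
    "1050 207 19 207 19192 4217\n37 32009 71 6247\n\n207\n243"
)

# Hypothetical tokenizer output that disagrees on the fourth line.
actual = [list(ids) for ids in expected]
actual[3] = [208]

idx = first_mismatch(expected, actual)
if idx is not None:
    print(f"line {idx + 1}: expected {expected[idx]}, got {actual[idx]}")
```

Reporting only the first differing line keeps the output short when a regex or pre-tokenizer change shifts every later token id.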