llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-22 10:48:12 +00:00

Files

Georgi Gerganov 921772104b speculative : add grammar support (#2991)

* speculative : add grammar support

* grammars : add json_arr.gbnf

* grammar : add comments to new grammar file

* grammar : remove one nested level

* common : warm-up with 2 tokens - seems to work better

* speculative : print draft token pieces

* speculative : reuse grammar parser + better logs and comments

* speculative : avoid grammar_mem

* make : fix speculative build

2023-09-05 08:46:17 +03:00

CMakeLists.txt

speculative : PoC for speeding-up inference via speculative sampling (#2926)

2023-09-03 15:12:08 +03:00

speculative.cpp

speculative : add grammar support (#2991)

2023-09-05 08:46:17 +03:00