mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-03 07:47:40 -04:00

stduhpf e0324285a5 speculative : threading options (#4959)

* speculative: expose draft threading

* fix usage format

* accept -td and -tbd args

* speculative: revert default behavior when -td is unspecified

* fix trailing whitespace

2024-01-16 13:04:32 +02:00

CMakeLists.txt

build : link against build info instead of compiling against it (#3879)

2023-11-02 08:50:16 +02:00

README.md

english : use typos to fix comments and logs (#4354)

2023-12-12 11:53:36 +02:00

speculative.cpp

speculative : threading options (#4959)

2024-01-16 13:04:32 +02:00

README.md

llama.cpp/examples/speculative

Demonstration of speculative decoding and tree-based speculative decoding techniques

More info:

https://github.com/ggerganov/llama.cpp/pull/2926
https://github.com/ggerganov/llama.cpp/pull/3624
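
For readers new to the technique, the basic (non-tree) greedy variant of speculative decoding can be sketched as below. This is a toy illustration only, not the code in speculative.cpp: `next_token_fn` and `speculative_generate` are hypothetical names, and the "models" are plain callables rather than llama.cpp contexts. It also assumes greedy sampling, and it calls the target once per token for clarity, whereas a real implementation would score all drafted tokens in a single batch, which is where the speed-up comes from.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a language model: maps a token sequence to its
// greedy (most likely) next token. Not part of the llama.cpp API.
using next_token_fn = std::function<int(const std::vector<int> &)>;

// Greedy speculative decoding sketch: the cheap draft model proposes up to
// n_draft tokens, the target model checks them in order, and the longest
// agreeing prefix is kept. The result matches target-only greedy decoding.
std::vector<int> speculative_generate(const next_token_fn & target,
                                      const next_token_fn & draft,
                                      std::vector<int> tokens,
                                      std::size_t n_predict,
                                      std::size_t n_draft) {
    const std::size_t n_end = tokens.size() + n_predict;

    while (tokens.size() < n_end) {
        // Draft phase: extend a scratch copy of the context with draft tokens.
        std::vector<int> ctx = tokens;
        std::vector<int> proposal;
        for (std::size_t i = 0; i < n_draft; ++i) {
            const int t = draft(ctx);
            proposal.push_back(t);
            ctx.push_back(t);
        }

        // Verify phase: always keep the target's token. Stop at the first
        // disagreement, or after one extra token if every draft was accepted.
        for (std::size_t i = 0; i <= proposal.size() && tokens.size() < n_end; ++i) {
            const int t = target(tokens);
            tokens.push_back(t);
            if (i == proposal.size() || t != proposal[i]) {
                break;
            }
        }
    }

    return tokens;
}
```

Because every emitted token comes from the target, the draft model affects only how many tokens are accepted per round, not the output. The tree-based variant mentioned above, which tracks several draft branches at once, is not covered by this sketch.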