llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-17 13:40:55 -04:00

Files

Jeffrey Morgan b5e95468b1 llama : add support for llama 3.1 rope scaling factors (#8676)

* Add llama 3.1 rope scaling factors to llama conversion and inference

This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope operation, improving results for context windows above 8192 tokens.

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <git@compilade.net>

* address comments

* address comments

* Update src/llama.cpp

Co-authored-by: compilade <git@compilade.net>

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <git@compilade.net>

---------

Co-authored-by: compilade <git@compilade.net>

2024-07-27 15:03:45 +03:00
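
The commit message above describes generating per-dimension rope factors at conversion time. A minimal sketch of how such factors could be computed follows. It is not the code from `convert_hf_to_gguf.py`: the function name `llama31_rope_factors` is hypothetical, and the default values (`rope_theta`, `factor`, `low_freq_factor`, `high_freq_factor`) are assumptions taken from the publicly released Llama 3.1 configuration. Only the 8192 threshold comes from the commit message itself.

```python
import math


def llama31_rope_factors(
    head_dim: int = 128,            # assumed per-head dimension
    rope_theta: float = 500000.0,   # assumed RoPE base
    factor: float = 8.0,            # assumed scaling factor
    low_freq_factor: float = 1.0,   # assumed
    high_freq_factor: float = 4.0,  # assumed
    old_context_len: int = 8192,    # original context window
) -> list[float]:
    """Return one scaling factor per rotary frequency (head_dim // 2 values)."""
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor

    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (rope_theta ** (i / head_dim))
        wavelen = 2.0 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # Short wavelengths (high frequencies) are left unscaled.
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:
            # Long wavelengths (low frequencies) get the full factor.
            factors.append(factor)
        else:
            # In between, interpolate smoothly between the two regimes.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors


if __name__ == "__main__":
    f = llama31_rope_factors()
    print(len(f), f[0], f[-1])  # 64 1.0 8.0
```

In this sketch the resulting list would be stored as a tensor in the converted model and supplied to the rope operation as per-dimension frequency factors. How the tensor is named and wired into `ggml_rope_ext` is left out here, since the commit message does not spell it out.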

CMakeLists.txt

llama : move vocab, grammar and sampling into separate files (#8508)

2024-07-23 13:10:17 +03:00

llama-grammar.cpp

ggml : reduce hash table reset cost (#8698)

2024-07-27 04:41:55 +02:00

llama-grammar.h

llama : fix build + fix fabs compile warnings (#8683)

2024-07-25 19:57:31 +03:00

llama-impl.h

llama : move vocab, grammar and sampling into separate files (#8508)