llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-07-26 11:13:53 -04:00

Files

pculliton e57dc62057 llama: Add support for Gemma2ForCausalLM (#8156)

* Inference support for Gemma 2 model family

* Update convert-hf-to-gguf.py, constants, and tensor mappings

* cleanup

* format fix

* Fix special token vocab bug

* Don't add space prefix

* fix deleted lines

* Update src/llama.cpp

Co-authored-by: slaren <slarengh@gmail.com>

* Add model type names

* Add control vector

* Fix model type identification

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>

2024-06-27 21:00:43 -07:00

__init__.py

convert-hf : support direct Q8_0 conversion (#7234)

2024-05-13 14:10:51 -04:00

constants.py

llama: Add support for Gemma2ForCausalLM (#8156)

2024-06-27 21:00:43 -07:00

gguf_reader.py

Gguf dump start data offset via --data-offset and some extra refactor (#8054)

2024-06-25 22:03:25 +10:00

gguf_writer.py

Option to split during conversion (#6942)