llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-26 11:13:53 -04:00

Xuan Son Nguyen 49122a873f gemma2: add sliding window mask (#8227)

* gemma2: add sliding window mask

* fix data_swa uninitialized

* better naming

* add co-author

Co-authored-by: Arlo Phoenix <arlo-phoenix@users.noreply.github.com>

* replace list with single tensor

* update

* llama : minor styling

* convert : add sanity check for query_pre_attn_scalar

* fix small typo in README

---------

Co-authored-by: Arlo Phoenix <arlo-phoenix@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-07-01 18:48:34 +02:00

__init__.py

convert-hf : support direct Q8_0 conversion (#7234)

2024-05-13 14:10:51 -04:00

constants.py

gemma2: add sliding window mask (#8227)

2024-07-01 18:48:34 +02:00

gguf_reader.py

Gguf dump start data offset via --data-offset and some extra refactor (#8054)

2024-06-25 22:03:25 +10:00

gguf_writer.py

gemma2: add sliding window mask (#8227)