tqcq/llama.cpp
Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-07-23 19:25:51 +00:00)
Files in llama.cpp/gguf-py/gguf at commit bfd2f21fb43525a8757a8c9e44032fd14bac222b
Latest commit: 0996149911 by Francis Couture-Harpin (2024-06-27 02:06:28 -04:00)
convert-hf : allow converting the weird BitNet 1.3B
Its FFN size is 5460, which is not convenient. The offending tensors are kept in F16, which makes the final model 5.01 bpw.
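The "not convenient" remark presumably refers to block-wise quantization: the 1.625 bpw ternary packing (introduced in constants.py, listed below) operates on fixed-size blocks, and a row length that is not a multiple of the block size cannot be packed, so those tensors are left in F16. The following is a minimal sketch of that divisibility check; the block sizes tried here (32, 64, 256) are common ggml values used purely for illustration, not necessarily the ones the ternary format actually uses.

```python
# Hypothetical check: why a row length of 5460 is awkward for block quantization.
# The block sizes below are illustrative; the actual block size of the BitNet
# ternary packing is not stated on this page.
ffn_size = 5460  # = 2^2 * 3 * 5 * 7 * 13, so no power-of-two factor above 4

for block_size in (32, 64, 256):
    divisible = ffn_size % block_size == 0
    print(f"block size {block_size:>3}: divisible = {divisible}")

# All three checks print False: rows of length 5460 cannot be split into whole
# quantization blocks, so the affected tensors stay in F16 and the model ends
# up at 5.01 bpw overall, as the commit message notes.
```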
File | Last commit | Date
__init__.py | convert-hf : support direct Q8_0 conversion (#7234) | 2024-05-13 14:10:51 -04:00
constants.py | ggml-quants : 1.625 bpw ternary packing for BitNet 1.58b | 2024-06-27 02:06:22 -04:00
gguf_reader.py | Gguf dump start data offset via --data-offset and some extra refactor (#8054) | 2024-06-25 22:03:25 +10:00
gguf_writer.py | Option to split during conversion (#6942) | 2024-06-24 19:42:03 +10:00
gguf.py | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) | 2023-11-11 08:04:50 +03:00
lazy.py | convert-hf : support direct Q8_0 conversion (#7234) | 2024-05-13 14:10:51 -04:00
py.typed | convert : various script cleanups/fixes + merges and special token handling (#2842) | 2023-08-30 11:25:50 +03:00
quants.py | convert-hf : allow converting the weird BitNet 1.3B | 2024-06-27 02:06:28 -04:00
tensor_mapping.py | gguf-py, convert-hf : model conversion support for T5 and FLAN-T5 model variants (#5763) | 2024-06-24 07:06:05 +02:00
vocab.py | Move convert.py to examples/convert-legacy-llama.py (#7430) | 2024-05-30 21:40:00 +10:00