llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-28 13:20:27 -04:00

Files

compilade a226bc7a9a gguf-py : support lazy tensor splitting (#12809)

* gguf-py : support lazy tensor splitting

Splitting usually involves returning tuples of tensors,
which need to be handled properly to avoid early eager evaluation.

* gguf-py : fix flake8 lint

2025-04-08 09:03:07 +02:00
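
The commit message above notes that splitting returns tuples of tensors, and that those tuples must be handled carefully to avoid early eager evaluation. The following is a minimal pure-Python sketch of that idea, not the code from lazy.py: `LazyValue`, `load`, `split`, and `lazy_split` are hypothetical names invented for illustration, and plain lists stand in for tensors. It shows a split that hands back a tuple of lazy pieces, so the source data is only touched when one of the pieces is actually evaluated.

```python
from __future__ import annotations

from typing import Any, Callable


class LazyValue:
    """Hypothetical stand-in for a lazy tensor: holds a thunk, runs it on demand."""

    def __init__(self, thunk: Callable[[], Any]):
        self._thunk = thunk
        self._cached: Any = None
        self._done = False

    def evaluate(self) -> Any:
        # Run the thunk at most once and cache the result.
        if not self._done:
            self._cached = self._thunk()
            self._done = True
        return self._cached


evaluations: list[str] = []  # records each time the source data is really loaded


def load() -> list[int]:
    """Pretend to read tensor data; a list of ints stands in for a tensor."""
    evaluations.append("load")
    return list(range(6))


def split(values: list[int], parts: int) -> tuple[list[int], ...]:
    """Eager split into equal-sized chunks, returned as a tuple."""
    size = len(values) // parts
    return tuple(values[i * size:(i + 1) * size] for i in range(parts))


def lazy_split(source: LazyValue, parts: int) -> tuple[LazyValue, ...]:
    """Return a tuple of lazy pieces without evaluating the source."""
    # The eager split is itself deferred and shared by all pieces.
    whole = LazyValue(lambda: split(source.evaluate(), parts))
    # Bind i as a default argument so each piece keeps its own index.
    return tuple(LazyValue(lambda i=i: whole.evaluate()[i]) for i in range(parts))


tensor = LazyValue(load)
a, b, c = lazy_split(tensor, 3)  # unpacking the tuple does not load anything
loads_before = list(evaluations)  # still empty at this point
middle = b.evaluate()  # the first real use triggers the load and the split
print(loads_before, middle, evaluations)
```

The design point is that the tuple itself is built eagerly (its length must be known for unpacking), while each element stays deferred. A naive wrapper that treated the whole tuple as a single lazy result would have to evaluate it as soon as the caller unpacked it.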

scripts

Refactor gguf scripts to improve metadata handling (#11909)

2025-02-26 08:04:48 -05:00

__init__.py

convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (#7499)

2024-07-18 20:40:15 +10:00

constants.py

llama : Support llama 4 text-only (#12791)

2025-04-07 23:06:44 +02:00

gguf_reader.py

Refactor gguf scripts to improve metadata handling (#11909)

2025-02-26 08:04:48 -05:00

gguf_writer.py

llama : Support llama 4 text-only (#12791)

2025-04-07 23:06:44 +02:00

gguf.py

gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)

2023-11-11 08:04:50 +03:00

lazy.py

gguf-py : support lazy tensor splitting (#12809)

2025-04-08 09:03:07 +02:00

metadata.py

convert : fix Norway problem when parsing YAML (#12114)

2025-02-28 17:44:46 +01:00

py.typed

convert : various script cleanups/fixes + merges and special token handling (#2842)

2023-08-30 11:25:50 +03:00

quants.py

ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)

2024-09-05 21:48:47 -04:00

tensor_mapping.py

llama : support BailingMoE (Ling) (#12634)

2025-03-30 22:21:03 +02:00

utility.py

repo : update links to new url (#11886)

2025-02-15 16:40:57 +02:00

vocab.py

convert : Support chat_template.json (#12460)

2025-03-19 08:58:13 +01:00