tqcq/llama.cpp
mirror of https://github.com/ggml-org/llama.cpp.git
Branch: xsn/graph_ffn_gate_fix
llama.cpp/gguf-py/gguf
Latest commit: a7366faa5b (compilade): gguf-py : avoid requiring pyside6 for other scripts (#13036), 2025-05-05 22:27:31 -04:00

- gguf-py : remove gguf-py/gguf/scripts/__init__.py because it's not needed

Implicit namespace packages have been supported since Python 3.3 (https://peps.python.org/pep-0420/), so the entry points in pyproject.toml can refer directly to the main functions.
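A minimal sketch of what such entry points look like in pyproject.toml; the command names and module paths below are illustrative placeholders rather than the exact entries in gguf-py's own pyproject.toml.

```toml
# Minimal sketch of console-script entry points (illustrative names).
# Each entry maps a command name to "package.module:function"; with PEP 420
# implicit namespace packages, gguf/scripts/ needs no __init__.py for these
# module paths to resolve.
[project.scripts]
gguf-dump = "gguf.scripts.gguf_dump:main"
gguf-set-metadata = "gguf.scripts.gguf_set_metadata:main"
```

Installing the package then exposes each name as a command-line executable that calls the referenced function directly, without importing anything from a scripts/__init__.py.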
File              | Last commit                                                                               | Date
scripts           | gguf-py : avoid requiring pyside6 for other scripts (#13036)                             | 2025-05-05 22:27:31 -04:00
__init__.py       | …                                                                                         |
constants.py      | llama-model : support Qwen2 embedding models and pooling_mode_lasttoken (#13245)         | 2025-05-02 11:42:30 -04:00
gguf_reader.py    | Refactor gguf scripts to improve metadata handling (#11909)                               | 2025-02-26 08:04:48 -05:00
gguf_writer.py    | convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf (#13209)              | 2025-05-02 17:17:15 +02:00
gguf.py           | …                                                                                         |
lazy.py           | gguf-py : support lazy tensor splitting (#12809)                                          | 2025-04-08 09:03:07 +02:00
metadata.py       | convert : fix Norway problem when parsing YAML (#12114)                                   | 2025-02-28 17:44:46 +01:00
py.typed          | …                                                                                         |
quants.py         | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)                         | 2024-09-05 21:48:47 -04:00
tensor_mapping.py | clip : fix confused naming ffn_up and ffn_down (#13290)                                   | 2025-05-05 12:54:44 +02:00
utility.py        | convert : ability to lazy-load safetensors remotely without downloading to disk (#12820) | 2025-04-10 17:24:44 +02:00
vocab.py          | convert : Support chat_template.json (#12460)                                             | 2025-03-19 08:58:13 +01:00