llama.cpp/gguf at a75cb30dc9e63488c3614e2d5a9fe2306eaf47cd - llama.cpp - Cat's Mantra

tqcq/llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-07-28 13:20:27 -04:00

Jared Van Bortel 2f567611c0 llama-model : support Qwen2 embedding models and pooling_mode_lasttoken (#13245)

2025-05-02 11:42:30 -04:00

..

gguf-py : GGUF Editor GUI - Python + Qt6 (#12930)

2025-04-18 20:30:41 +02:00

__init__.py

convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (#7499)

2024-07-18 20:40:15 +10:00

constants.py

llama-model : support Qwen2 embedding models and pooling_mode_lasttoken (#13245)

2025-05-02 11:42:30 -04:00

gguf_reader.py

Refactor gguf scripts to improve metadata handling (#11909)

2025-02-26 08:04:48 -05:00

gguf_writer.py

convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf (#13209)

2025-05-02 17:17:15 +02:00

gguf.py

…

lazy.py

gguf-py : support lazy tensor splitting (#12809)

2025-04-08 09:03:07 +02:00

metadata.py

convert : fix Norway problem when parsing YAML (#12114)

2025-02-28 17:44:46 +01:00

py.typed

…

quants.py

ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)

2024-09-05 21:48:47 -04:00

tensor_mapping.py

convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf (#13209)

2025-05-02 17:17:15 +02:00

utility.py

convert : ability to lazy-load safetensors remotely without downloading to disk (#12820)

2025-04-10 17:24:44 +02:00

vocab.py

convert : Support chat_template.json (#12460)

2025-03-19 08:58:13 +01:00