llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-28 21:23:55 -04:00

Files

hxer7963 069574775c [Model] Add support for xverse (#6301)

* Support xverse model convert to gguf format.

* 1. Convert xverse models to gguf;
  2. Add LLM_ARCH_XVERSE inference in llama.cpp;
  3. Add xverse item in Supported models in README.md;

* gguf-py: remove redundant logs
* llama: remove the init_mapping_prefetch custom parameter

* llama.cpp: Include the changes from #6122 to exclude the unused outputs of the last layers.

* Fix format issues
* Remove duplicate set kqv_out to llm_build_kv

* Update llama.cpp

---------

Co-authored-by: willhe <willhe@xverse.cn>
Co-authored-by: willhe <hexin@xverse.cn>

2024-03-29 14:37:03 +01:00

__init__.py

gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)

2023-11-11 08:04:50 +03:00

constants.py

[Model] Add support for xverse (#6301)

2024-03-29 14:37:03 +01:00

gguf_reader.py

gguf : add support for I64 and F64 arrays (#6062)

2024-03-15 10:46:51 +02:00

gguf_writer.py

llama : add Command-R support (#6033)