# MiniCPM-o 4
## Prepare models and code
Download the MiniCPM-o-4 PyTorch model from Hugging Face into the `MiniCPM-o-4` folder.
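For example, the weights can be fetched with the `huggingface-cli` tool. This is a sketch; the repository id below is an assumption, so substitute the actual MiniCPM-o 4 model id from the official model card:

```bash
# hypothetical example: the repo id "openbmb/MiniCPM-o-4" is assumed, check the official model card
pip install -U "huggingface_hub[cli]"
huggingface-cli download openbmb/MiniCPM-o-4 --local-dir MiniCPM-o-4
```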
## Build llama.cpp
Readme modification time: 2025-02-06

If there are differences in usage, please refer to the official build documentation.
Clone llama.cpp:
```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
```
Build llama.cpp using CMake:
```bash
cmake -B build
cmake --build build --config Release
```
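The commands above produce a CPU-only build. GPU backends are selected at configure time; as one example, a CUDA build could be enabled roughly like this (assuming the CUDA toolkit is installed, see the official build documentation for other backends):

```bash
# sketch: enable the CUDA backend at configure time (requires the CUDA toolkit)
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```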
## Usage of MiniCPM-o 4
Convert the PyTorch model to gguf files (you can also download the gguf files we have already converted):
```bash
# split the vision components out of the PyTorch checkpoint
python ./tools/mtmd/legacy-models/minicpmv-surgery.py -m ../MiniCPM-o-4

# convert the image encoder / projector to gguf (mmproj)
python ./tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py -m ../MiniCPM-o-4 --minicpmv-projector ../MiniCPM-o-4/minicpmv.projector --output-dir ../MiniCPM-o-4/ --minicpmv_version 6

# convert the language model to gguf
python ./convert_hf_to_gguf.py ../MiniCPM-o-4/model

# quantize int4 version
./build/bin/llama-quantize ../MiniCPM-o-4/model/ggml-model-f16.gguf ../MiniCPM-o-4/model/ggml-model-Q4_K_M.gguf Q4_K_M
```
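If everything ran with the output names used above, the resulting files should look roughly like this (layout shown for orientation only):

```
MiniCPM-o-4/
├── mmproj-model-f16.gguf        # multimodal projector, passed via --mmproj
└── model/
    ├── ggml-model-f16.gguf      # f16 language model
    └── ggml-model-Q4_K_M.gguf   # int4-quantized language model
```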
### Inference on Linux or Mac
```bash
# run in single-turn mode
./build/bin/llama-mtmd-cli -m ../MiniCPM-o-4/model/ggml-model-f16.gguf --mmproj ../MiniCPM-o-4/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

# run in conversation mode
./build/bin/llama-mtmd-cli -m ../MiniCPM-o-4/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-o-4/mmproj-model-f16.gguf
```
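The same gguf files can also be served over HTTP. A minimal sketch, assuming your `llama-server` build includes multimodal (mtmd) support and accepts the `--mmproj` flag:

```bash
# sketch: serve the quantized model plus projector over llama-server's HTTP API
./build/bin/llama-server -m ../MiniCPM-o-4/model/ggml-model-Q4_K_M.gguf \
    --mmproj ../MiniCPM-o-4/mmproj-model-f16.gguf -c 4096 --port 8080
```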