mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-07-01 21:15:06 +00:00
# Gemma 3 vision

> [!IMPORTANT]
>
> This is very experimental, only used for demo purposes.

## How to get mmproj.gguf?

```bash
cd gemma-3-4b-it
python ../llama.cpp/examples/llava/gemma3_convert_encoder_to_gguf.py .

# output file is mmproj.gguf
```

## How to run it?
What you need:
- The text model GGUF, which can be converted using `convert_hf_to_gguf.py`
- The mmproj file from the step above
- An image file

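The first item glosses over the text-model conversion. A minimal sketch of that step, assuming llama.cpp is checked out as a sibling of the model directory and you want the output named `gemma-3-4b-it.gguf` (both are assumptions, adjust to your layout):

```shell
# assumed layout: llama.cpp checked out next to the gemma-3-4b-it model directory
python llama.cpp/convert_hf_to_gguf.py gemma-3-4b-it --outfile gemma-3-4b-it.gguf
```

The resulting file is then passed as the `{text_model}.gguf` argument in the run command.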

```bash
# build
cmake -B build
cmake --build build --target llama-gemma3-cli

# run it
./build/bin/llama-gemma3-cli -m {text_model}.gguf --mmproj mmproj.gguf --image your_image.jpg
```