* convert ok, load ok
* warmup ok
* test
* still does not work?
* fix padding
* temporary give up
* fix merge conflict
* build_ultravox()
* rm test
* fix merge conflict
* add necessary mtmd APIs
* first working version (only 4s of audio)
* will this monster compile?
* fix compile
* please compile
* fPIC
* fix windows
* various fixes
* clean up audio_helpers
* fix conversion
* add some debug stuff
* long audio input ok
* adapt the api
* add --audio arg
* final touch UX
* add miniaudio to readme
* fix typo
* refactor kv metadata
* mtmd_default_marker()
* server : (experimental) vision support via libmtmd
* mtmd : add more api around mtmd_image_tokens
* mtmd : add more api around mtmd_image_tokens
* mtmd : ability to calc image hash
* shared_ptr for mtmd_image_tokens
* move hash to user-define ID (fixed)
* abstract out the batch management
* small fix
* refactor logic adding tokens to batch
* implement hashing image
* use FNV hash, now hash bitmap instead of file data
* allow decoding image embedding to be split into batches
* rm whitespace
* disable some features when mtmd is on
* fix --no-mmproj-offload
* mtmd_context_params no timings
* refactor server_inp to server_tokens
* fix the failing test case
* init
* wip
* working version
* add mtmd::bitmaps
* add test target
* rm redundant define
* test: mtmd_input_chunks_free
* rm outdated comment
* fix merging issue
* explicitly create mtmd::input_chunks
* mtmd_input_chunk_copy
* add clone()
* improve server_input struct
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff
* fix detokenize
* add const to various places
* add warning about breaking changes
* add c api
* helper: use mtmd_image_tokens_get_n_pos
* fix ctx_shift
* fix name shadowing
* more strict condition
* support remote image_url
* remote image_url log
* add CI test
* do not log base64
* add "has_multimodal" to /props
* remove dangling image
* speculative: use slot.cache_tokens.insert
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* rm can_be_detokenized
* on prmpt processing done, assert cache_tokens.size
* handle_completions_impl returns void
* adapt the new web ui
* update docs and hot topics
* rm assert
* small fix (2)
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* cmake : do not include ./src as public for libllama
ggml-ci
* cmake : rework tests
ggml-ci
* llguidance : remove unicode include
ggml-ci
* cmake : make c++17 private
ggml-ci
* arg : clean up handling --mmproj with -hf
* rm change about no_mmproj
* Revert "rm change about no_mmproj"
This reverts commit 2cac8e0efb.
* handle no_mmproj explicitly
* skip download mmproj on examples not using it
* mtmd : merge `llava-cli` and `gemma3-cli` into single `mtmd-cli`
* support for minicpmv
* remove cpp files of llava and minicpmv
* update hot topics
* mtmd : add not supported msg for qwen2vl
* Update examples/llava/mtmd.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* common: custom hf endpoint support
Add support for custom huggingface endpoints via HF_ENDPOINT environment variable
You can now specify a custom huggingface endpoint using the HF_ENDPOINT environment variable when using the --hf-repo flag, which works similarly to huggingface-cli's endpoint configuration.
Example usage:
HF_ENDPOINT=https://hf-mirror.com/ ./bin/llama-cli --hf-repo Qwen/Qwen1.5-0.5B-Chat-GGUF --hf-file qwen1_5-0_5b-chat-q2_k.gguf -p "The meaning to life and the universe is"
The trailing slash in the URL is optional:
HF_ENDPOINT=https://hf-mirror.com ./bin/llama-cli --hf-repo Qwen/Qwen1.5-0.5B-Chat-GGUF --hf-file qwen1_5-0_5b-chat-q2_k.gguf -p "The meaning to life and the universe is"
* Update common/arg.cpp
readability Improvement
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* Apply suggestions from code review
---------
Co-authored-by: ベアトリーチェ <148695646+MakiSonomura@users.noreply.github.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* (wip) refactor downloading system [no ci]
* fix all examples
* fix mmproj with -hf
* gemma3: update readme
* only handle mmproj in llava example
* fix multi-shard download
* windows: fix problem with std::min and std::max
* fix 2
* added -o option to specify an output file name
* llama-tts returns ENOENT in case of file write error
note : PR #12042 is closed as superseded with this one.
As discussed in PR 'llama-tts : add -o option' (#12042):
* common_params : 'out_file' string is the only output file name parameter left in common_params. It's intended to be used in all example programs implementing an '-o' option.
* cvector-generator, export-lora, imatrix : default output filenames moved from 'common_params' to the 'main()' of each example program.
* Add chat template formatting to -no-cnv
* only enable prompt formatting if explicitly enabled
* add -st / --single-turn
* add --single-turn and -p in conversation mode
* fix -sys + -p
* reword warning
* small readability change and fix (long) outdated example usage
* only activate single turn in conversation mode
This commit adds a preset for llama.vim to use the default Qwen 2.5
Coder models.
The motivation for this change is to make it easier to start a server
suitable to be used with the llama.vim plugin. For example, the server
can be started with a command like the following:
```console
$ llama.vim --fim-qwen-1.5b-default
```
Refs: https://github.com/ggml-org/llama.cpp/issues/10932
This commit adds completion for `--chat-template-file`, enabling only
`.jinja` files to be displayed as completions.
Example usage:
```console
$ ./build/bin/llama-cli --chat-template-file models/templates/<TAB>
models/templates/CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja
models/templates/CohereForAI-c4ai-command-r-plus-tool_use.jinja
models/templates/deepseek-ai-DeepSeek-R1-Distill-Llama-8B.jinja
models/templates/deepseek-ai-DeepSeek-R1-Distill-Qwen-32B.jinja
models/templates/fireworks-ai-llama-3-firefunction-v2.jinja
models/templates/google-gemma-2-2b-it.jinja
models/templates/llama-cpp-deepseek-r1.jinja
models/templates/meetkai-functionary-medium-v3.1.jinja
models/templates/meetkai-functionary-medium-v3.2.jinja
models/templates/meta-llama-Llama-3.1-8B-Instruct.jinja
models/templates/meta-llama-Llama-3.2-3B-Instruct.jinja
models/templates/meta-llama-Llama-3.3-70B-Instruct.jinja
models/templates/microsoft-Phi-3.5-mini-instruct.jinja
models/templates/mistralai-Mistral-Nemo-Instruct-2407.jinja
models/templates/NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja
models/templates/NousResearch-Hermes-3-Llama-3.1-8B-tool_use.jinja
models/templates/Qwen-Qwen2.5-7B-Instruct.jinja
```
This is not limited to the models/templates directory, it can be used
anywhere in the filesystem, the above is just an example.