6c0501adf7
context : remove logits_all flag
...
ggml-ci
2025-05-08 13:01:34 +03:00
32916a4907
clip : refactor graph builder ( #13321 )
...
* mtmd : refactor graph builder
* fix qwen2vl
* clean up siglip cgraph
* pixtral migrated
* move minicpmv to a dedicated build function
* move max_feature_layer to build_llava
* use build_attn for minicpm resampler
* fix windows build
* add comment for batch_size
* also support tinygemma3 test model
* qwen2vl does not use RMS norm
* fix qwen2vl norm (2)
2025-05-06 22:40:24 +02:00
233461f812
sampling : Integrate Top-nσ into main sampling chain (and add it to the server) ( #13264 )
...
* sampling: add Top-nσ sampler to `llama-server` and sampler ordering
* revert: sampler ordering
* revert: VS' crappy auto-formatting
* revert: VS' crappy auto-formatting pt.2
* revert: my crappy eye sight...
* sampling: add XTC to Top-nσ sampler chain
* sampling: add Dyna. Temp. to Top-nσ sampler chain
* sampling: actually remove Top-nσ from sampler(oops)
* Integrate top_n_sigma into main sampler chain
* Define COMMON_SAMPLER_TYPE_TOP_N_SIGMA
* Formatting
* Lint
* Exit early in the sampler if nsigma < 0
---------
Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com >
2025-05-05 22:12:19 +02:00
b34c859146
server : Webui - change setText command from parent window to also send the message. ( #13309 )
...
* setText command from parent window for llama-vscode now sends the message automatically.
* Upgrade packages versions to fix vulnerabilities with "npm audit fix" command.
* Fix code formatting.
* Add index.html.gz changes.
* Revert "Upgrade packages versions to fix vulnerabilities with "npm audit fix" command."
This reverts commit 67687b7fda
.
* easier approach
* add setTimeout
---------
Co-authored-by: igardev <ivailo.gardev@akros.ch >
Co-authored-by: Xuan Son Nguyen <son@huggingface.co >
2025-05-05 16:03:31 +02:00
9b61acf060
mtmd : rename llava directory to mtmd ( #13311 )
...
* mv llava to mtmd
* change ref everywhere
2025-05-05 16:02:55 +02:00
5215b91e93
clip : fix confused naming ffn_up and ffn_down ( #13290 )
...
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff
2025-05-05 12:54:44 +02:00
27aa259532
mtmd : add C public API ( #13184 )
...
* init
* wip
* working version
* add mtmd::bitmaps
* add test target
* rm redundant define
* test: mtmd_input_chunks_free
* rm outdated comment
* fix merging issue
* explicitly create mtmd::input_chunks
* mtmd_input_chunk_copy
* add clone()
* add const to various places
* add warning about breaking changes
* helper: use mtmd_image_tokens_get_n_pos
2025-05-04 23:43:42 +02:00
9fdfcdaedd
rpc : use backend registry, support dl backends ( #13304 )
2025-05-04 21:25:43 +02:00
86bd60d3fe
llava/mtmd : fixes to fully support dl backends ( #13303 )
2025-05-04 17:05:20 +02:00
3e959f0976
imatrix: fix oob writes if src1 is not contiguous ( #13286 )
2025-05-04 00:50:37 +02:00
36667c8edc
clip : revert the change of BOI/EOI token for GLM-edge ( ⚠️ breaking change) ( #13259 )
2025-05-03 20:07:54 +02:00
1d36b3670b
llama : move end-user examples to tools directory ( #13249 )
...
* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co >
2025-05-02 20:27:13 +02:00