R0CKSTAR
3025b621d1
llama-bench: rename DB table name from test to llama_bench (#15003)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-08-02 17:20:40 +08:00
R0CKSTAR
484b2091ce
compare-commits.sh: support both llama-bench and test-backend-ops (#14392)
* compare-commits.sh: support both llama-bench and test-backend-ops
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
* Speed up the build by specifying -j 12
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Remove build_number from test-backend-ops db
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Apply suggestion from @JohannesGaessler
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Refine tool selection logic
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Address review comments
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-08-01 08:47:27 +08:00
Georgi Gerganov
e32a4ec60e
sync : ggml
ggml-ci
2025-07-30 17:33:11 +03:00
Johannes Gäßler
bbd0f91779
server-bench: make seed choice configurable (#14929)
* server-bench: make seed choice configurable
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* fix error formatting
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-07-29 10:40:50 +02:00
Georgi Gerganov
1f45f2890e
sync : ggml
2025-07-28 08:15:01 +03:00
Aman Gupta
446595b9b3
Docs: add instructions for adding backends (#14889)
2025-07-27 09:36:43 +08:00
Georgi Gerganov
2df255da3c
sync : ggml
ggml-ci
2025-07-24 20:27:23 +03:00
Georgi Gerganov
b17230917c
sync : ggml
2025-07-19 11:46:50 +03:00
Johannes Gäßler
5cae766541
scripts: synthetic prompt mode for server-bench.py (#14695)
2025-07-16 09:33:28 +02:00
Johannes Gäßler
494c5899cb
scripts: benchmark for HTTP server throughput (#14668)
* scripts: benchmark for HTTP server throughput
* fix server connection reset
2025-07-14 13:14:30 +02:00
Georgi Gerganov
8eff95544e
sync : ggml
2025-07-12 16:13:27 +03:00
Georgi Gerganov
215535701d
sync : ggml
ggml-ci
2025-07-12 14:25:44 +03:00
Aman Gupta
11ee0fea2a
Docs: script to auto-generate ggml operations docs (#14598)
* Docs: script to auto-generate ggml operations docs
* Review: formatting changes + change github action
* Use built-in types instead of typing
* docs : add BLAS and Metal ops
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-07-10 23:29:01 +08:00
Georgi Gerganov
d4cdd9c1c3
ggml : remove kompute backend (#14501)
ggml-ci
2025-07-03 07:48:32 +03:00
Georgi Gerganov
e17991c466
sync : ggml
ggml-ci
2025-07-02 20:08:45 +03:00
Georgi Gerganov
f61c05d4b1
sync : ggml
ggml-ci
2025-07-01 11:06:39 +03:00
Vedran Miletić
e9b6350e61
scripts : make the shell scripts cross-platform (#14341)
2025-06-30 10:17:18 +02:00
Georgi Gerganov
06cbedfca1
sync : ggml
ggml-ci
2025-06-20 21:02:47 +03:00
Georgi Gerganov
d03172cc79
sync : ggml
ggml-ci
2025-06-18 09:59:21 +03:00
Aman Gupta
2e42be42bd
compare-llama-bench: add option to plot (#14169)
* compare llama-bench: add option to plot
* Address review comments: convert case + add type hints
* Add matplotlib to requirements
* fix tests
* Improve comment and fix assert condition for test
* Add back default test_name, add --plot_log_scale
* use log_scale regardless of x_values
2025-06-14 10:34:20 +02:00
Georgi Gerganov
ae92c1855b
sync : ggml
ggml-ci
2025-06-10 18:39:33 +03:00
Georgi Gerganov
b8e2194efc
sync : ggml
ggml-ci
2025-06-10 09:21:56 +03:00
Georgi Gerganov
f3a4b1659c
sync : ggml
ggml-ci
2025-06-01 13:43:57 +03:00
Georgi Gerganov
53f925074d
sync : vendor (#13901)
* sync : vendor
ggml-ci
* cont : fix httplib version
ggml-ci
* cont : fix lint
* cont : fix lint
* vendor : move to common folder /vendor
ggml-ci
* cont : fix lint
* cont : move httplib to /vendor + use json_fwd.hpp
ggml-ci
* cont : fix server build
ggml-ci
* cont : add missing headers
ggml-ci
* cont : header clean-up
ggml-ci
2025-05-30 16:25:45 +03:00
Georgi Gerganov
1c49c70d07
sync : ggml
2025-05-27 18:05:33 +03:00
Georgi Gerganov
a26c4cc11e
scripts : add option to compare commits in Debug (#13806)
* scripts : add option to compare commits in Debug
* cont : reuse existing CMAKE_OPTS
2025-05-26 22:24:01 +03:00
Olivier Chafik
f5cd27b71d
server: streaming of tool calls and thoughts when --jinja is on (#12379)
* add common_json w/ support for truncated json healing
* add common_chat_msg_diff
* partial common_chat_parse
* refactor parser w/ optionals
* server: wire chat diffs in stream mode
* fix trigger of thinking models (must happen after thoughts are closed)
* fix functionary v3.2 raw python!
* rename: common_chat_syntax (now contains format)
* rm common_regex.at_start
* don't return empty <think></think>
* accommodate yet another deepseek r1 distill fantasy syntax (`<|tool▁calls|>`)
* fix QwQ 32B tool call parsing after thoughts (hermes2)
* better logs for grammar triggers
* consume spaces after parse_json_tool_calls
* fix required tool calls w/ thinking models that have pre-opened thinking tags
* fix thinking model's initial trigger + test qwq's template
* run most test_tool_call tests in stream + non-stream modes
* make functionary v3.2 parsing more strict (differentiate first match from others)
* send final diff from server, to close off raw python arguments
* support partial content streaming in Generic mode
* tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5)
* Update function-calling.md
* Update tool_bench.py
* chat-parser: remove input from exception (llm output may contain PII)
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Olivier Chafik <ochafik@users.noreply.github.com>
2025-05-25 01:48:08 +01:00
Georgi Gerganov
d30cb5a7fa
sync : ggml
ggml-ci
2025-05-19 13:29:56 +03:00
Sigbjørn Skjæret
be1d4a13db
scripts : fix compare-llama-bench.py show parameter (#13514)
2025-05-14 08:41:01 +02:00
Sigbjørn Skjæret
bf79371120
scripts : support arbitrary input file formats in compare-llama-bench.py (#13455)
2025-05-13 15:31:12 +02:00
Georgi Gerganov
1e2809bc4b
sync : ggml
2025-05-13 14:02:28 +03:00
Sigbjørn Skjæret
09232370fc
scripts : exit compare-llama-bench.py gracefully when there's nothing to compare (#13451)
2025-05-11 16:20:39 +02:00
Georgi Gerganov
d879433824
sync : ggml
ggml-ci
2025-05-07 17:28:36 +03:00
Diego Devesa
1d36b3670b
llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00
Georgi Gerganov
b34443923c
sync : ggml (#13268)
* vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204)
* vulkan : add kernels for depthwise 2d convolution (OP_CONV_2D_DW)
* review: remove src_x/y < 0 checks; add performance tests
* sync : ggml
ggml-ci
* vulkan : fix lint (#0)
---------
Co-authored-by: Acly <aclysia@gmail.com>
2025-05-02 20:54:30 +03:00
Georgi Gerganov
b1dd4d08e8
sync : ggml
ggml-ci
2025-05-01 20:15:34 +03:00
Georgi Gerganov
8d33d740c3
sync : ggml
2025-05-01 10:00:39 +03:00
Johannes Gäßler
19e899ce21
scripts: n_depth for compare-llama-bench [no ci] (#13201)
2025-04-29 23:32:04 +02:00
Georgi Gerganov
63b4911494
sync : ggml
ggml-ci
2025-04-24 17:32:47 +03:00
Georgi Gerganov
526739b879
sync : ggml
ggml-ci
2025-04-14 09:26:15 +03:00
Georgi Gerganov
47ba87d0a4
sync : ggml
2025-04-11 00:17:47 +03:00
Georgi Gerganov
eb420e1148
sync : ggml
ggml-ci
2025-04-11 00:17:47 +03:00
Georgi Gerganov
e4bf72d631
scripts : fix sync-ggml-am.sh
2025-04-11 00:17:47 +03:00
Georgi Gerganov
a4e46e28f9
sync : ggml
ggml-ci
2025-04-07 18:44:17 +03:00
Georgi Gerganov
0114a32da0
sync : ggml
ggml-ci
2025-03-31 15:07:32 +03:00
Georgi Gerganov
d3f1f0acfb
sync : ggml
ggml-ci
2025-03-30 08:33:31 +03:00
Georgi Gerganov
029c693fdc
sync : ggml
ggml-ci
2025-03-27 10:09:29 +02:00
Georgi Gerganov
771d84371c
scripts : update sync + fix cmake merge
ggml-ci
2025-03-27 10:09:29 +02:00
Georgi Gerganov
df0665a483
sync : ggml
ggml-ci
2025-03-27 09:04:38 +02:00
Georgi Gerganov
102ac1891d
sync : ggml
ggml-ci
2025-03-07 14:49:44 +02:00