Commit Graph

212 Commits

SHA1 Message Date
a2c6fd747c scripts : sync update 2024-11-07 23:07:55 +02:00
ce027adfb3 sync : ggml 2024-11-04 10:33:37 +02:00
815fe72adc sync : ggml 2024-11-01 10:28:24 +02:00
c5b0f4b5d9 llama : refactor model loader with backend registry (#10026) 2024-10-30 02:01:23 +01:00
8d8ff71536 llama : remove Tail-Free sampling (#10071) 2024-10-29 10:42:05 +02:00
cc2983d375 sync : ggml 2024-10-26 10:34:08 +03:00
9e4a2563ea scripts : fix amx sync [no ci] 2024-10-26 10:33:31 +03:00
190a37d797 sync : ggml 2024-10-23 17:23:55 +03:00
17bb928080 readme : remove --memory-f32 references (#9925) 2024-10-17 23:43:05 +03:00
0e41b300ed sync : ggml 2024-10-16 11:28:14 +03:00
fa42aa6d89 scripts : fix spelling typo in messages and comments (#9782) 2024-10-08 09:19:53 +03:00
    Signed-off-by: Masanari Iida <standby24x7@gmail.com>
b6d6c5289f sync : llama.cpp 2024-10-06 12:53:28 +03:00
58b16695e1 sync : ggml 2024-10-05 15:53:49 +03:00
17880771ad sync : ggml 2024-10-04 18:50:25 +03:00
1bb8a64ebf sync : ggml 2024-10-03 21:17:49 +03:00
c83ad6d01e ggml-backend : add device and backend reg interfaces (#9707) 2024-10-03 01:49:47 +02:00
    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
f1b8c42711 sync : ggml 2024-10-01 16:09:42 +03:00
d0b1d663e4 sync : ggml 2024-09-29 21:16:07 +03:00
bb5f819975 sync : ggml 2024-09-24 11:01:18 +03:00
4301535326 sync : ggml 2024-09-20 21:15:05 +03:00
0d2f22e45c scripts : verify py deps at the start of compare (#9520) 2024-09-18 18:34:32 +03:00
385decbd63 sync : ggml 2024-09-08 11:05:55 +03:00
60a3107ccd scripts : option to increase git patch context 2024-09-08 11:05:55 +03:00
231cff5f6f sync : ggml 2024-08-27 22:41:27 +03:00
4305b57c80 sync : ggml 2024-08-09 10:03:48 +03:00
afd27f01fe scripts : sync cann files (#0) 2024-08-08 14:56:52 +03:00
366d486c16 scripts : fix sync filenames (#0) 2024-08-08 14:40:12 +03:00
e44a561ab0 sync : ggml 2024-08-08 13:19:47 +03:00
5587e57a76 sync : ggml 2024-08-05 08:50:57 +03:00
5e2727fe03 scripts : sync vulkan-shaders (#0) 2024-07-27 18:08:47 +03:00
56f20aa25d scripts : sync ggml-aarch64 sources 2024-07-27 18:07:33 +03:00
ae7985cd7b sync : ggml 2024-07-27 17:43:44 +03:00
3f2d538b81 scripts : fix sync for sycl 2024-07-08 13:51:31 +03:00
2ee44c9a18 sync : ggml 2024-07-08 12:23:00 +03:00
3fd62a6b1c py : type-check all Python scripts with Pyright (#8341) 2024-07-07 15:04:39 -04:00
    * py : type-check all Python scripts with Pyright
    * server-tests : use trailing slash in openai base_url
    * server-tests : add more type annotations
    * server-tests : strip "chat" from base_url in oai_chat_completions
    * server-tests : model metadata is a dict
    * ci : disable pip cache in type-check workflow
      The cache is not shared between branches, and it's 250MB in size,
      so it would become quite a big part of the 10GB cache limit of the repo.
    * py : fix new type errors from master branch
    * tests : fix test-tokenizer-random.py
      Apparently, gcc applies optimisations even when pre-processing,
      which confuses pycparser.
    * ci : only show warnings and errors in python type-check
      The "information" level otherwise has entries
      from 'examples/pydantic_models_to_grammar.py',
      which could be confusing for someone trying to figure out what failed,
      considering that these messages can safely be ignored
      even though they look like errors.
e235b267a2 py : switch to snake_case (#8305) 2024-07-05 07:53:33 +03:00
    * py : switch to snake_case
    * cont
    * cont
    * cont : fix link
    * gguf-py : use snake_case in scripts entrypoint export
    * py : rename requirements for convert_legacy_llama.py
      Needed for scripts/check-requirements.sh
    Co-authored-by: Francis Couture-Harpin <git@compilade.net>
821922916f fix: Update script paths in CI scripts 2024-07-04 15:39:13 +00:00
07a3fc0608 Removes multiple newlines at the end of files that were breaking the editorconfig step of CI. (#8258) 2024-07-02 12:18:10 -04:00
c70d117c37 scripts : fix filename sync 2024-06-26 23:25:22 +03:00
f2d48fffde sync : ggml 2024-06-26 19:39:19 +03:00
f3f65429c4 llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00
    * scripts : update sync [no ci]
    * files : relocate [no ci]
    * ci : disable kompute build [no ci]
    * cmake : fixes [no ci]
    * server : fix mingw build
    * cmake : minor [no ci]
    * cmake : link math library [no ci]
    * cmake : build normal ggml library (not object library) [no ci]
    * cmake : fix kompute build
    * make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
    * move public backend headers to the public include directory (#8122)
    * nix test
    * spm : fix metal header
    * scripts : fix sync paths [no ci]
    * scripts : sync ggml-blas.h [no ci]
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    Co-authored-by: slaren <slarengh@gmail.com>
37bef89433 tokenizer : BPE fixes (#7530) 2024-06-18 18:40:52 +02:00
    * Random test: add_bos_token, add_eos_token
    * Random test: add BPE models for testing
    * Custom regex split fails with codepoint 0
    * Fix falcon punctuation regex
    * Refactor llm_tokenizer_bpe: move code to constructor
    * Move 'add_special_bos/eos' logic to llm_tokenizer_bpe
    * Move tokenizer flags to vocab structure
    * Default values for special_add_bos/eos
    * Build vocab.special_tokens_cache using vocab token types
    * Generalize 'jina-v2' per token attributes
    * Fix unicode whitespaces (deepseek-coder, deepseek-llm)
    * Skip missing byte tokens (falcon)
    * Better unicode data generation
    * Replace char32_t with uint32_t
5326bcceeb ggml : sync 2024-06-18 09:50:45 +03:00
1c641e6aac build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
    * `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
    * server: update refs -> llama-server
      gitignore llama-server
    * server: simplify nix package
    * main: update refs -> llama
      fix examples/main ref
    * main/server: fix targets
    * update more names
    * Update build.yml
    * rm accidentally checked in bins
    * update straggling refs
    * Update .gitignore
    * Update server-llm.sh
    * main: target name -> llama-cli
    * Prefix all example bins w/ llama-
    * fix main refs
    * rename {main->llama}-cmake-pkg binary
    * prefix more cmake targets w/ llama-
    * add/fix gbnf-validator subfolder to cmake
    * sort cmake example subdirs
    * rm bin files
    * fix llama-lookup-* Makefile rules
    * gitignore /llama-*
    * rename Dockerfiles
    * rename llama|main -> llama-cli; consistent RPM bin prefixes
    * fix some missing -cli suffixes
    * rename dockerfile w/ llama-cli
    * rename(make): llama-baby-llama
    * update dockerfile refs
    * more llama-cli(.exe)
    * fix test-eval-callback
    * rename: llama-cli-cmake-pkg(.exe)
    * address gbnf-validator unused fread warning (switched to C++ / ifstream)
    * add two missing llama- prefixes
    * Updating docs for eval-callback binary to use new `llama-` prefix.
    * Updating a few lingering doc references for rename of main to llama-cli
    * Updating `run-with-preset.py` to use new binary names.
      Updating docs around `perplexity` binary rename.
    * Updating documentation references for lookup-merge and export-lora
    * Updating two small `main` references missed earlier in the finetune docs.
    * Update apps.nix
    * update grammar/README.md w/ new llama-* names
    * update llama-rpc-server bin name + doc
    * Revert "update llama-rpc-server bin name + doc"
      This reverts commit e474ef1df4.
    * add hot topic notice to README.md
    * Update README.md
    * Update README.md
    * rename gguf-split & quantize bins refs in **/tests.sh
    Co-authored-by: HanClinto <hanclinto@gmail.com>
1442677f92 common : refactor cli arg parsing (#7675) 2024-06-04 21:23:39 +03:00
    * common : gpt_params_parse do not print usage
    * common : rework usage print (wip)
    * common : valign
    * common : rework print_usage
    * infill : remove cfg support
    * common : reorder args
    * server : deduplicate parameters
    * common : add missing header
    * common : remove --random-prompt usages
    * examples : migrate to gpt_params
    * batched-bench : migrate to gpt_params
    * retrieval : migrate to gpt_params
    * common : change defaults for escape and n_ctx
    * common : remove chatml and instruct params
    * common : passkey use gpt_params
554c247caf ggml : remove OpenCL (#7735) 2024-06-04 21:23:20 +03:00
adc9ff3841 llama-bench : allow using a different printer for stderr with -oe (#7722) 2024-06-04 14:32:42 +02:00
    compare-commits.sh : hide stdout, use -oe to print markdown
c8047d538f scripts: update compare_llama_bench.py [no ci] (#7673) 2024-05-31 16:26:21 +02:00
9c4c9cc83f Move convert.py to examples/convert-legacy-llama.py (#7430) 2024-05-30 21:40:00 +10:00
    * Move convert.py to examples/convert-no-torch.py
    * Fix CI, scripts, readme files
    * convert-no-torch -> convert-legacy-llama
    * Move vocab thing to vocab.py
    * Fix convert-no-torch -> convert-legacy-llama
    * Fix lost convert.py in ci/run.sh
    * Fix imports
    * Fix gguf not imported correctly
    * Fix flake8 complaints
    * Fix check-requirements.sh
    * Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE
    * Review fixes
00281b7be3 scripts : remove mpi remnants 2024-05-29 14:31:18 +03:00