Commit Graph

246 Commits

Author SHA1 Message Date
980ab41afb docker : add gpu image CI builds (#3103)
Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.

Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
2023-09-14 19:47:00 +03:00
1b0d09259e cmake : support build for iOS/tvOS (#3116)
* cmake : support build for iOS/tvOS

* ci : add iOS/tvOS build into macOS-latest-cmake

* ci : split ios/tvos jobs
2023-09-11 19:49:06 +08:00
afc43d5f82 cov : add Code Coverage and codecov.io integration (#2928)
* update .gitignore

* makefile: add coverage support (lcov, gcovr)

* add code-coverage workflow

* update code coverage workflow

* wun on ubuntu 20.04

* use gcc-8

* check why the job hang

* add env vars

* add LLAMA_CODE_COVERAGE=1 again

* - add CODECOV_TOKEN
- add missing make lcov-report

* install lcov

* update make file -pb flag

* remove unused  GGML_NITER from workflows

* wrap coverage output files in COV_TARGETS
2023-09-03 11:48:49 +03:00
0d1c706181 gguf : add workflow for Pypi publishing (#2896)
* gguf : add workflow for Pypi publishing

* gguf : add workflow for Pypi publishing

* fix trailing whitespace
2023-08-30 12:47:40 +03:00
9509294420 make : add test and update CI (#2897)
* build ci: run make test

* makefile:
- add all
- add test

* enable tests/test-tokenizer-0-llama

* fix path to model

* remove gcc-8 from macos build test

* Update Makefile

* Update Makefile
2023-08-30 12:42:51 +03:00
ef955fbd23 Tag release with build number (#2732)
* Modified build.yml to use build number for release

* Add the short hash back into the tag

* Prefix the build number with b
2023-08-24 15:58:02 +02:00
Eve
1fed755b1f ci : add non-AVX scalar build/test (#2356)
* noavx build and test

* we don't need to remove f16c in windows
2023-07-25 15:16:13 +03:00
5656d10599 mpi : add support for distributed inference via MPI (#2099)
* MPI support, first cut

* fix warnings, update README

* fixes

* wrap includes

* PR comments

* Update CMakeLists.txt

* Add GH workflow, fix test

* Add info to README

* mpi : trying to move more MPI stuff into ggml-mpi (WIP) (#2099)

* mpi : add names for layer inputs + prep ggml_mpi_graph_compute()

* mpi : move all MPI logic into ggml-mpi

Not tested yet

* mpi : various fixes - communication now works but results are wrong

* mpi : fix output tensor after MPI compute (still not working)

* mpi : fix inference

* mpi : minor

* Add OpenMPI to GH action

* [mpi] continue-on-error: true

* mpi : fix after master merge

* [mpi] Link MPI C++ libraries to fix OpenMPI

* tests : fix new llama_backend API

* [mpi] use MPI_INT32_T

* mpi : factor out recv / send in functions and reuse

* mpi : extend API to allow usage with outer backends (e.g. Metal)

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-10 18:49:56 +03:00
a7e20edf22 ci : switch threads to 1 (#2138) 2023-07-07 21:23:57 +03:00
1d656d6360 ggml : change ggml_graph_compute() API to not require context (#1999)
* ggml_graph_compute: deprecate using ggml_context, try resolve issue #287

* rewrite: no longer consider backward compitability; plan and make_plan

* minor: rename ctx as plan; const

* remove ggml_graph_compute from tests/test-grad0.c, but current change breaks backward

* add static ggml_graph_compute_sugar()

* minor: update comments

* reusable buffers

* ggml : more consistent naming + metal fixes

* ggml : fix docs

* tests : disable grad / opt + minor naming changes

* ggml : add ggml_graph_compute_with_ctx()

- backwards compatible API
- deduplicates a lot of copy-paste

* ci : enable test-grad0

* examples : factor out plan allocation into a helper function

* llama : factor out plan stuff into a helper function

* ci : fix env

* llama : fix duplicate symbols + refactor example benchmark

* ggml : remove obsolete assert + refactor n_tasks section

* ggml : fix indentation in switch

* llama : avoid unnecessary bool

* ggml : remove comments from source file and match order in header

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-07 19:24:01 +03:00
1b107b8550 ggml : generalize quantize_fns for simpler FP16 handling (#1237)
* Generalize quantize_fns for simpler FP16 handling

* Remove call to ggml_cuda_mul_mat_get_wsize

* ci : disable FMA for mac os actions

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-05 19:13:06 +03:00
698efad5fb CI: make the brew update temporarily optional. (#2092)
until they decide to fix the brew installation in the macos runners.
see the open issues. eg https://github.com/actions/runner-images/pull/7710
2023-07-04 01:50:12 +02:00
e4caa8da59 ci : run when changing only the CUDA sources (#1800) 2023-06-12 20:12:47 +03:00
e7fe66e670 ci : disable auto tidy (#1705) 2023-06-05 23:05:05 +03:00
0df7d63e5b Include server in releases + other build system cleanups (#1610)
Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets build. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases.

Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default)

Fix issue where `vdot` binary wasn't removed when running `make clean`.

Fix compile warnings in `server` example.

Add `.hpp` files to trigger workflow (the server example has one).
2023-05-27 11:04:14 -06:00
0ecb1bbbeb [CI] Fix openblas (#1613)
* Fix OpenBLAS build

* Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.
2023-05-27 17:24:06 +03:00
83c54e6da5 [CI] CLBlast: Fix directory name (#1606) 2023-05-27 14:18:25 +02:00
ac7876ac20 Update CLBlast to 1.6.0 (#1580)
* Update CLBlast to 1.6.0
2023-05-24 10:30:09 +03:00
b8ee340abe feature : support blis and other blas implementation (#1536)
* feature: add blis support

* feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927

* fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake

* Fix typo in INTEGER

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Fix: blas changes on ci

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-20 17:58:31 +03:00
553fd4d4b5 Add clang-tidy reviews to CI (#1407) 2023-05-12 15:40:53 +02:00
e1295513a4 CI: add Windows CLBlast and OpenBLAS builds (#1277)
* Add OpenCL and CLBlast support

* Add OpenBLAS support

* Remove testing from matrix

* change build name to 'clblast'
2023-05-07 13:20:09 +02:00
a3b85b28da ci : add cublas to windows release (#1271) 2023-05-05 22:56:09 +02:00
2ec83428de Fix build for gcc 8 and test in CI (#1154) 2023-04-24 15:38:26 +00:00
857308d1e8 ci : trigger CI for drafts, but not most PR actions (#1125) 2023-04-22 16:12:29 +03:00
7e312f165c cmake : fix build under Windows when enable BUILD_SHARED_LIBS (#1100)
* Fix build under Windows when enable BUILD_SHARED_LIBS

* Make AVX512 test on Windows to build the shared libs
2023-04-22 11:18:20 +03:00
6a9661ea5a ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the CI (#1074)
[Accelerate](https://developer.apple.com/documentation/accelerate) is an Apple framework which can only be used on macOS, and the CMake build [ignores](https://github.com/ggerganov/llama.cpp/blob/master/CMakeLists.txt#L102) the `LLAMA_ACCELERATE` variable when run on non-Apple platforms. This implies setting `LLAMA_ACCELERATE` is a no-op on Ubuntu and can be removed.

This will reduce visual noise in CI check results (in addition to reducing the number of checks we have to run for every PR). Right now every sanitized build is duplicated twice for no good reason (e.g., we have `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, ON)` and `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, OFF)`).
2023-04-20 18:15:18 +03:00
5af8e32238 ci : do not run on drafts 2023-04-18 19:57:06 +03:00
8b679987cd Fix whitespace, add .editorconfig, add GitHub workflow (#883) 2023-04-11 19:45:44 +00:00
9cbc404ba6 ci : re-enable AVX512 testing (Windows-MSVC) (#584)
* CI: Re-enable AVX512 testing (Windows-MSVC)

Now with 100% less base64 encoding

* plain __cpuid is enough here
2023-03-29 23:44:39 +03:00
f1217055ea CI: fix subdirectory path globbing (#546)
- Changes in subdirectories will now be detecter properly
- (Windows-MSVC) AVX512 tests temporarily disabled
2023-03-28 22:43:25 +03:00
96f9c0506f ci : make ctest verbose, hopefully we see what is wrong with the sanitizer 2023-03-28 20:01:09 +03:00
34c1072e49 ci: add debug build to sanitizer build matrix (#527) 2023-03-26 15:48:40 +00:00
8c2ec5e21d Add support for linux/arm64 platform during Docker Builds (#514)
* Add support for linux/arm64 platform

* Add platform to versioned builds
2023-03-26 14:48:42 +00:00
19726169b3 CI: Run other sanitizer builds even if one fails (#511)
applies only to sanitizer builds so they wont be cancelled
2023-03-26 00:13:28 +02:00
2f7bf7dd7c CMake / CI additions (#497)
* CMake: Add AVX512 option

* CI: Add AVX/AVX512 builds (Windows)
(AVX512 tests can only be run when the worker happens to support it, building works anyway)

* CMake: Fix sanitizer linkage ( merged #468 )

* CI: Add sanitizer builds (Ubuntu)

* CI: Fix release tagging
(change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR Added commitish as input zendesk/action-create-release#32 is merged)
2023-03-25 23:38:11 +02:00
e4412b45e3 CI: CMake: Separate build and test steps (#376)
* CI: Separate Build and Test steps (CMake)

* CI: Make sure build passes before running tests (CMake)

* CI: Standardise step id names
2023-03-23 04:20:34 +02:00
69c92298a9 Deduplicate q4 quantization functions (#383)
* Deduplicate q4 quantization functions

* Use const; add basic test

* Re-enable quantization test

* Disable AVX2 flags in CI

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-22 19:29:06 +02:00
e6c9e0986c Fix bin dir for win ci 2023-03-22 00:01:08 +02:00
01a297b099 specify build type for ctest on windows (#371) 2023-03-21 23:34:25 +02:00
eb34620aec Add tokenizer test + revert to C++11 (#355)
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
2023-03-21 17:29:41 +02:00
0f1b21cb90 Docker - Fix publish docker image in GitHub Registry (#235)
* fix publish permission

* try to fix docker pipeline using as password github_token & username repository_owner
2023-03-20 18:05:20 +01:00
b2de7f18df CI Improvements (#230)
* CI Improvements

Manual build feature, autoreleases for Windows

* better CI naming convention

use branch name in releases and tags
2023-03-18 09:27:12 +02:00
6b0df5ccf3 add ptread link to fix cmake build under linux (#114)
* add ptread link to fix cmake build under linux

* add cmake to linux and macos platform

* separate make and cmake workflow

---------

Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17 13:38:24 -03:00
2af23d3043 🚀 Dockerize llamacpp (#132)
* feat: dockerize llamacpp

* feat: split build & runtime stages

* split dockerfile into main & tools

* add quantize into tool docker image

* Update .devops/tools.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add docker action pipeline

* change CI to publish at github docker registry

* fix name runs-on macOS-latest is macos-latest (lowercase)

* include docker versioned images

* fix github action docker

* fix docker.yml

* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17 10:47:06 +01:00
2f700a2738 Add windows to the CI (#98) 2023-03-13 22:29:10 +02:00
2d555e5b42 Add CI (#60) 2023-03-12 22:08:24 +02:00