Commit Graph

335 Commits

Author SHA1 Message Date
Jeff Bolz
53903ae6fa vulkan: increase timeout for CI (#14574) 2025-07-08 09:38:31 +02:00
Georgi Gerganov
d4cdd9c1c3 ggml : remove kompute backend (#14501)
ggml-ci
2025-07-03 07:48:32 +03:00
Rotem Dan
f3ed38d793 Set RPATH to "@loader_path" / "$ORIGIN" to ensure executables and dynamic libraries search for dependencies in their origin directory. (#14309) 2025-07-02 18:37:16 +02:00
Sigbjørn Skjæret
611ba4b264 ci : add OpenCL to labeler workflow (#14496) 2025-07-02 09:02:51 +02:00
Eric Zhang
85841e121d github : add OpenCL backend to issue templates (#14492) 2025-07-02 08:41:35 +03:00
Georgi Gerganov
de56944147 ci : disable fast-math for Metal GHA CI (#14478)
* ci : disable fast-math for Metal GHA CI

ggml-ci

* cont : remove -g flag

ggml-ci
2025-07-01 18:04:08 +03:00
Sigbjørn Skjæret
6609507a91 ci : fix windows build and release (#14431) 2025-06-28 09:57:07 +02:00
bandoti
ce82bd0117 ci: add workflow for relocatable cmake package (#14346) 2025-06-23 15:30:51 -03:00
Jeff Bolz
bf2a99e3cb vulkan: update windows SDK in release.yml (#14344) 2025-06-23 15:44:48 +02:00
Jeff Bolz
3a9457df96 vulkan: update windows SDK in CI (#14334) 2025-06-23 10:19:24 +02:00
Diego Devesa
6adc3c3ebc llama : add thread safety test (#14035)
* llama : add thread safety test

* llamafile : remove global state

* llama : better LLAMA_SPLIT_MODE_NONE logic

when main_gpu < 0 GPU devices are not used

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-16 08:11:43 -07:00
bandoti
0dbcabde8c cmake: clean up external project logic for vulkan-shaders-gen (#14179)
* Remove install step for vulkan-shaders-gen

* Add install step to normalize msvc with make

* Regenerate modified shaders at build-time
2025-06-16 10:32:13 -03:00
Jeff Bolz
652b70e667 vulkan: force device 0 in CI (#14106) 2025-06-10 10:53:47 -05:00
Diego Devesa
7f4fbe5183 llama : allow building all tests on windows when not using shared libs (#13980)
* llama : allow building all tests on windows when not using shared libraries

* add static windows build to ci

* tests : enable debug logs for test-chat

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-09 20:03:09 +02:00
Yuanhao Ji
056eb74534 CANN: Enable labeler for Ascend NPU (#13914) 2025-06-09 11:20:06 +08:00
吴小白
5787b5da57 ci: add LoongArch cross-compile build (#13944) 2025-06-07 10:39:11 -03:00
Diego Devesa
2589ad3704 ci : remove cuda 11.7 releases, switch runner to windows 2022 (#13997) 2025-06-04 15:37:40 +02:00
Diego Devesa
482548716f releases : use dl backend for linux release, remove arm64 linux release (#13996) 2025-06-04 13:15:54 +02:00
bandoti
d98f2a35fc ci: disable LLAMA_CURL for Linux cross-builds (#13871) 2025-05-28 15:46:47 -03:00
Diego Devesa
a2d02d5793 releases : bundle llvm omp library in windows release (#13763) 2025-05-25 00:55:16 +02:00
Diego Devesa
17fc817b58 releases : enable openmp in windows cpu backend build (#13756) 2025-05-24 22:27:03 +02:00
Diego Devesa
b775345d78 ci : enable winget package updates (#13734) 2025-05-23 23:14:00 +03:00
Diego Devesa
a70a8a69c2 ci : add winget package updater (#13732) 2025-05-23 22:09:38 +02:00
Diego Devesa
3079e9ac8e release : fix windows hip release (#13707)
* release : fix windows hip release

* make single hip release with multiple targets
2025-05-23 00:21:37 +02:00
Diego Devesa
d643bb2c79 releases : build CPU backend separately (windows) (#13642) 2025-05-21 22:09:57 +02:00
R0CKSTAR
33983057d0 musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647)
* musa: fix build warning (unused parameter)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: upgrade MUSA SDK version to rc4.0.1

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: use mudnn::Unary::IDENTITY op to accelerate D2D memory copy

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Update ggml/src/ggml-cuda/cpy.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* musa: remove MUDNN_CHECK_GEN and use CUDA_CHECK_GEN instead in MUDNN_CHECK

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-05-21 09:58:49 +08:00
Alberto Cabrera Pérez
f71f40a284 ci : upgraded oneAPI version in SYCL workflows and dockerfile (#13532) 2025-05-19 11:46:09 +01:00
Diego Devesa
415e40a357 releases : use arm version of curl for arm releases (#13592) 2025-05-16 19:36:51 +02:00
Sigbjørn Skjæret
7c07ac244d ci : add ppc64el to build-linux-cross (#13575) 2025-05-16 14:54:23 +02:00
Thammachart Chinvarapon
b064a51a4e ci: free_disk_space flag enabled for intel variant (#13426)
before cleanup: 20G
after cleanup: 44G
after all built and pushed: 24G

https://github.com/Thammachart/llama.cpp/actions/runs/14945093573/job/41987371245
2025-05-10 16:34:48 +02:00
Jeff Bolz
dc1d2adfc0 vulkan: scalar flash attention implementation (#13324)
* vulkan: scalar flash attention implementation

* vulkan: always use fp32 for scalar flash attention

* vulkan: use vector loads in scalar flash attention shader

* vulkan: remove PV matrix, helps with register usage

* vulkan: reduce register usage in scalar FA, but perf may be slightly worse

* vulkan: load each Q value once. optimize O reduction. more tuning

* vulkan: support q4_0/q8_0 KV in scalar FA

* CI: increase timeout to accommodate newly-supported tests

* vulkan: for scalar FA, select between 1 and 8 rows

* vulkan: avoid using Float16 capability in scalar FA
2025-05-10 08:07:07 +02:00
Diego Devesa
15e03282bb ci : limit write permission to only the release step + fixes (#13392)
* ci : limit write permission to only the release step

* fix win cuda file name

* fix license file copy on multi-config generators
2025-05-08 23:45:22 +02:00
Diego Devesa
70a6991edf ci : move release workflow to a separate file (#13362) 2025-05-08 13:15:28 +02:00
Diego Devesa
814f795e06 docker : disable arm64 and intel images (#13356) 2025-05-07 16:36:33 +02:00
Diego Devesa
9f2da5871f llama : build windows releases with dl backends (#13220) 2025-05-04 14:20:49 +02:00
Diego Devesa
1d36b3670b llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00
bandoti
d24d592808 ci: fix cross-compile sync issues (#12804) 2025-05-01 19:06:39 -03:00
bandoti
00137157fc Disable CI cross-compile builds (#13022) 2025-04-19 18:05:03 +02:00
hipudding
54a7272043 CANN: Add x86 build ci (#12950)
* CANN: Add x86 build ci

* CANN: fix code format
2025-04-15 12:08:55 +01:00
R0CKSTAR
8ac9f5d765 ci : Replace freediskspace to free_disk_space in docker.yml (#12861)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-04-11 09:26:17 +02:00
R0CKSTAR
d9a63b2f2e musa: enable freediskspace for docker image build (#12839)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-04-09 11:22:30 +02:00
Chenguang Li
6e1c4cebdb CANN: Support Opt CONV_TRANSPOSE_1D and ELU (#12786)
* [CANN] Support ELU and CONV_TRANSPOSE_1D

* [CANN]Modification review comments

* [CANN]Modification review comments

* [CANN]name adjustment

* [CANN]remove lambda used in template

* [CANN]Use std::func instead of template

* [CANN]Modify the code according to the review comments

---------

Signed-off-by: noemotiovon <noemotiovon@gmail.com>
2025-04-09 14:04:14 +08:00
Xuan-Son Nguyen
bd3f59f812 cmake : enable curl by default (#12761)
* cmake : enable curl by default

* no curl if no examples

* fix build

* fix build-linux-cross

* add windows-setup-curl

* fix

* shell

* fix path

* fix windows-latest-cmake*

* run: include_directories

* LLAMA_RUN_EXTRA_LIBS

* sycl: no llama_curl

* no test-arg-parser on windows

* clarification

* try riscv64 / arm64

* windows: include libcurl inside release binary

* add msg

* fix mac / ios / android build

* will this fix xcode?

* try clearing the cache

* add bunch of licenses

* revert clear cache

* fix xcode

* fix xcode (2)

* fix typo
2025-04-07 13:35:19 +02:00
bandoti
1be76e4620 ci: add Linux cross-compile build (#12428) 2025-04-04 14:05:12 -03:00
0cc4m
a8a1f33567 Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)
* Vulkan: Add DP4A MMQ and Q8_1 quantization shader

* Add q4_0 x q8_1 matrix matrix multiplication support

* Vulkan: Add int8 coopmat MMQ support

* Vulkan: Add q4_1, q5_0 and q5_1 quants, improve integer dot code

* Add GL_EXT_integer_dot_product check

* Remove ggml changes, fix mmq pipeline picker

* Remove ggml changes, restore Intel coopmat behaviour

* Fix glsl compile attempt when integer vec dot is not supported

* Remove redundant code, use non-saturating integer dot, enable all matmul sizes for mmq

* Remove redundant comment

* Fix integer dot check

* Fix compile issue with unsupported int dot glslc

* Update Windows build Vulkan SDK version
2025-03-31 14:37:01 +02:00
Guus Waals
0fd8487b14 Fix visionOS build and add CI (#12415)
* ci: add visionOS build workflow

Add a new GitHub Actions workflow for building on visionOS with CMake and Xcode.

* ggml: Define _DARWIN_C_SOURCE for visionOS to fix missing u_xxx typedefs

* ci: remove define hacks for u_xxx system types

---------

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
2025-03-19 11:15:23 +01:00
Daniel Bevenius
7b61bcc87c ci : add --symlinks to xcframework zip command (#12409)
This commit adds the --symlinks option to the zip command used to create
the xcframework zip file. This is necessary to create symlinks in the
zip file. Without this option,  the Versions symlink is stored as a
regular directory entry in the zip file, rather than as a symlink in the
zip which causes the followig error in xcode:
```console
Couldn't resolve framework symlink for '/Users/danbev/work/ai/llama.cpp/tmp_1/build-apple/llama.xcframework/macos-arm64_x86_64/llama.framework/Versions/Current': readlink(/Users/danbev/work/ai/llama.cpp/tmp_1/build-apple/llama.xcframework/macos-arm64_x86_64/llama.framework/Versions/Current): Invalid argument (22)
```

Refs: https://github.com/ggml-org/llama.cpp/pull/11996#issuecomment-2727026377
2025-03-16 18:22:05 +01:00
Oscar Barenys
f08f4b3187 Update build.yml for Windows Vulkan builder to use Vulkan 1.4.304 SDK for VK_NV_cooperative_matrix2 support (#12301) 2025-03-12 20:06:58 +01:00
David Huang
f1648e91cf HIP: fix rocWMMA build flags under Windows (#12230) 2025-03-07 08:06:08 +01:00
David Huang
3ffbbd5ce1 HIP: rocWMMA documentation and enabling in workflow builds (#12179)
* Enable rocWMMA for Windows CI build

* Enable for Ubuntu

* GGML_HIP_ROCWMMA_FATTN documentation work
2025-03-06 14:14:11 +01:00