Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703)

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-17 13:40:55 -04:00

* CUDA multi GPU + scratch

ggml_cuda_compute_forward

Tensor parallelism

ggml_cuda_add

ggml_cuda_rms_norm

ggml_cuda_silu

CUDA scratch buffer

--main-gpu CLI option
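
For readers unfamiliar with the element-wise operators named above, the following is a minimal CPU reference sketch of what add, RMS norm, and SiLU compute. It is not the code from this commit: the function names `add_ref`, `rms_norm_ref`, and `silu_ref` are hypothetical, the `eps` default is an assumption, and the actual `ggml_cuda_*` functions run as CUDA kernels on device memory, with signatures that are not shown on this page.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference helpers; names and signatures are illustrative
// only and are NOT the ggml_cuda_* functions from the commit.

// Element-wise addition of two equally sized rows.
inline std::vector<float> add_ref(const std::vector<float> & a,
                                  const std::vector<float> & b) {
    assert(a.size() == b.size());
    std::vector<float> y(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        y[i] = a[i] + b[i];
    }
    return y;
}

// SiLU (sigmoid linear unit): silu(x) = x * sigmoid(x) = x / (1 + exp(-x)).
inline float silu_ref(float x) {
    return x / (1.0f + std::exp(-x));
}

// RMS normalization of one row: y[i] = x[i] / sqrt(mean(x^2) + eps).
// The eps default is an assumption, not a value taken from the commit.
inline std::vector<float> rms_norm_ref(const std::vector<float> & x,
                                       float eps = 1e-6f) {
    assert(!x.empty());
    double sum_sq = 0.0;
    for (float v : x) {
        sum_sq += static_cast<double>(v) * v;
    }
    const float mean_sq = static_cast<float>(sum_sq / x.size());
    const float scale   = 1.0f / std::sqrt(mean_sq + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale;
    }
    return y;
}
```

A reference like this is mainly useful for checking GPU results: run the same row through the device path and compare against `rms_norm_ref` or `silu_ref` within a small tolerance.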


Johannes Gäßler authored 2023-06-06 21:33:23 +02:00, committed by GitHub

parent 44f906e853

commit 17366df842

12 changed files with 1221 additions and 544 deletions

1319 ggml-cuda.cu

File diff suppressed because it is too large.
