mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-08-17 13:40:55 -04:00
OpenCL Token Generation Acceleration (#1459)
* Move back to C++ for OpenCL * Refactor OpenCL code to work more like the CUDA code, add missing functions * Deduplicate dequant kernels * Add OpenCL compile options * Use compile args for preprocessing constants * Restore default platform + device selection by id behavior --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de> Co-authored-by: Henri Vasserman <henv@hot.ee>
This commit is contained in:
1034
ggml-opencl.cpp
Normal file
1034
ggml-opencl.cpp
Normal file
File diff suppressed because it is too large
Load Diff
Reference in New Issue
Block a user