ab6ab8f809
rpc : send hash when tensor data is above some fixed threshold ( #12496 )
* rpc : send hash when tensor data is above some fixed threshold
ref #10095
* rpc : put cache under $HOME/.cache/llama.cpp
* try to fix win32 build
* another try to fix win32 build
* remove llama as dependency
2025-03-28 08:18:04 +02:00
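The hash-above-threshold idea from the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from #12496: the threshold value, the choice of 64-bit FNV-1a, and the names `kHashThreshold`, `fnv1a64`, `Payload`, and `choose_payload` are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical threshold in bytes; the value used by the RPC backend is not stated in the log.
constexpr std::size_t kHashThreshold = 1024;

// 64-bit FNV-1a, chosen here only for illustration.
inline std::uint64_t fnv1a64(const std::uint8_t * data, std::size_t len) {
    std::uint64_t h = 0xcbf29ce484222325ULL; // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;               // FNV prime
    }
    return h;
}

// What the sender would put on the wire: either the raw bytes or only their hash.
struct Payload {
    bool                      is_hash;
    std::uint64_t             hash;
    std::vector<std::uint8_t> bytes;
};

// Send only the hash when the tensor data exceeds the threshold, otherwise send the data itself.
inline Payload choose_payload(const std::vector<std::uint8_t> & tensor) {
    if (tensor.size() > kHashThreshold) {
        return {true, fnv1a64(tensor.data(), tensor.size()), {}};
    }
    return {false, 0, tensor};
}
```

A receiver in this sketch would look the hash up in a local cache (the commit places it under $HOME/.cache/llama.cpp) before asking for the full data.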
86bf31cfe6
rpc-server : add support for the SYCL backend ( #10934 )
2024-12-23 10:39:30 +02:00
9f40989351
ggml : move CPU backend to a separate file ( #10144 )
2024-11-03 19:34:08 +01:00
0e9f760eb1
rpc : add backend registry / device interfaces ( #9812 )
* rpc : add backend registry / device interfaces
* llama : add llama_supports_rpc API
* ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server
2024-10-10 20:14:55 +02:00
841713e1e4
rpc : enable vulkan ( #9714 )
closes #8536
2024-10-03 13:00:52 +03:00
b72942fac9
Merge commit from fork
2024-08-09 23:03:21 +03:00
fe1e3917cf
Revert "[SYCL] Update rpc-server.cpp to include SYCL backend ( #7682 )" ( #7808 )
This reverts commit 9422c5e34b.
2024-06-09 01:43:39 +02:00
9422c5e34b
[SYCL] Update rpc-server.cpp to include SYCL backend ( #7682 )
* Update rpc-server.cpp to include SYCL backend
Draft PR to address inclusion of SYCL backend for RPC server
* Update rpc-server.cpp
2024-06-02 12:13:54 +03:00
f4bd8b3d26
rpc : set SO_REUSEADDR for the server socket ( #7320 )
ref: #7293
2024-05-17 17:25:44 +03:00
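For context, setting SO_REUSEADDR on a listening socket lets a restarted server rebind its port without waiting for old connections to leave TIME_WAIT. A minimal POSIX sketch of the option, assuming a plain TCP socket; the helper name `make_reusable_socket` is hypothetical and is not taken from rpc-server.cpp:

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Creates a TCP socket with SO_REUSEADDR enabled.
// Returns the file descriptor, or -1 on failure.
inline int make_reusable_socket() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    int flag = 1;
    // Must be set before bind() for the option to affect the bind.
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &flag, sizeof(flag)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```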
9afdffe70e
rpc : get available mem for the CPU backend
This can be overridden with the -m command line option
ref: #7293
2024-05-16 12:04:08 +03:00
3b3963c55c
rpc : add command line arg for specifying backend memory
ref: #7293
2024-05-16 09:58:29 +03:00
5e31828d3e
ggml : add RPC backend ( #6829 )
* ggml : add RPC backend
The RPC backend proxies all operations to a remote server which runs a
regular backend (CPU, CUDA, Metal, etc).
* set TCP_NODELAY
* add CI workflows
* Address review comments
* fix warning
* implement llama_max_devices() for RPC
* Address review comments
* Address review comments
* wrap sockfd into a struct
* implement get_alignment and get_max_size
* add get_device_memory
* fix warning
* win32 support
* add README
* readme : trim trailing whitespace
* Address review comments
* win32 fix
* Address review comments
* fix compile warnings on macos
2024-05-14 14:27:19 +03:00