llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-06-27 20:05:20 +00:00

Author	SHA1	Message	Date
Radoslav Gerganov	ab6ab8f809	rpc : send hash when tensor data is above some fixed threshold (#12496) * rpc : send hash when tensor data is above some fixed threshold ref #10095 * rpc : put cache under $HOME/.cache/llama.cpp * try to fix win32 build * another try to fix win32 build * remove llama as dependency	2025-03-28 08:18:04 +02:00
Radoslav Gerganov	86bf31cfe6	rpc-server : add support for the SYCL backend (#10934)	2024-12-23 10:39:30 +02:00
Diego Devesa	9f40989351	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
Diego Devesa	0e9f760eb1	rpc : add backend registry / device interfaces (#9812) * rpc : add backend registry / device interfaces * llama : add llama_supports_rpc API * ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server	2024-10-10 20:14:55 +02:00
Radoslav Gerganov	841713e1e4	rpc : enable vulkan (#9714) closes #8536	2024-10-03 13:00:52 +03:00
Georgi Gerganov	b72942fac9	Merge commit from fork	2024-08-09 23:03:21 +03:00
slaren	fe1e3917cf	Revert "[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)" (#7808) This reverts commit `9422c5e34b`.	2024-06-09 01:43:39 +02:00
nickp27	9422c5e34b	[SYCL] Update rpc-server.cpp to include SYCL backend (#7682) * Update rpc-server.cpp to include SYCL backend Draft PR to address inclusion of SYCL backend for RPC server * Update rpc-server.cpp	2024-06-02 12:13:54 +03:00
Radoslav Gerganov	f4bd8b3d26	rpc : set SO_REUSEADDR for the server socket (#7320) ref: #7293	2024-05-17 17:25:44 +03:00
Radoslav Gerganov	9afdffe70e	rpc : get available mem for the CPU backend This can be overridden with the -m command line option ref: #7293	2024-05-16 12:04:08 +03:00
Radoslav Gerganov	3b3963c55c	rpc : add command line arg for specifying backend memory ref: #7293	2024-05-16 09:58:29 +03:00
Radoslav Gerganov	5e31828d3e	ggml : add RPC backend (#6829) * ggml : add RPC backend The RPC backend proxies all operations to a remote server which runs a regular backend (CPU, CUDA, Metal, etc). * set TCP_NODELAY * add CI workflows * Address review comments * fix warning * implement llama_max_devices() for RPC * Address review comments * Address review comments * wrap sockfd into a struct * implement get_alignment and get_max_size * add get_device_memory * fix warning * win32 support * add README * readme : trim trailing whitespace * Address review comments * win32 fix * Address review comments * fix compile warnings on macos	2024-05-14 14:27:19 +03:00

12 Commits