Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-07-26 11:13:53 -04:00)
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>