* llama : use n_swa + n_ubatch cells for SWA cache

ggml-ci

* llama : add warning about multi-sequence SWA contexts
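The first commit sizes the SWA (sliding-window attention) KV cache at n_swa + n_ubatch cells: enough for the attention window itself plus the micro-batch of tokens currently being processed. A minimal sketch of that sizing rule, not the actual llama.cpp code; the function name `swa_cache_size` and the example values are hypothetical:

```cpp
#include <cstdint>
#include <cstdio>

// Sketch of the sizing rule from the commit title: an SWA layer only ever
// attends to the last n_swa tokens, so its KV cache needs n_swa cells for
// the window plus n_ubatch cells for the tokens being decoded right now.
static uint32_t swa_cache_size(uint32_t n_swa, uint32_t n_ubatch) {
    return n_swa + n_ubatch;
}

int main() {
    // Hypothetical values: a 4096-token sliding window, 512-token micro-batches.
    const uint32_t n_swa    = 4096;
    const uint32_t n_ubatch = 512;

    printf("SWA cache cells: %u\n", (unsigned) swa_cache_size(n_swa, n_ubatch)); // 4608
    return 0;
}
```

Bounding the cache this way keeps SWA-layer memory proportional to the window size rather than the full context length, which is presumably why the second commit adds a warning: with multiple sequences sharing such a cache, tokens evicted from the window are gone, so multi-sequence SWA contexts need care.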