Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-07-23 19:25:51 +00:00)
Its FFN size is 5460, which is inconvenient because it is not a multiple of the quantization block size (5460 is divisible by neither 32 nor 256). The offending tensors are therefore kept in F16, which brings the final model to 5.01 bpw.
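The divisibility issue can be checked directly. A minimal sketch, assuming llama.cpp's block-quant constraint that a tensor's row size must be a multiple of the block size (32 elements for the legacy quants, 256 for the k-quant super-blocks); the function name here is illustrative, not part of llama.cpp:

```python
def quantizable(row_size: int, block_size: int) -> bool:
    # A row can use a block-based quant only if it splits into whole blocks.
    return row_size % block_size == 0

ffn = 5460
print(quantizable(ffn, 32))   # False: fails the 32-element legacy block size
print(quantizable(ffn, 256))  # False: fails the 256-element k-quant super-block
```

Since 5460 fits neither block size, the quantizer falls back to F16 for those tensors, raising the average bits per weight of the final file.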