mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-08-13 20:07:41 -04:00
CANN: Improve loading efficiency after converting weights to NZ format. (#14985)
* CANN: Improve loading efficiency after converting weights to NZ format. * CANN: fix typo
This commit is contained in:
@@ -310,5 +310,7 @@ Specifies the memory pool management strategy:
|
||||
|
||||
Controls automatic cleanup of the memory pool. This option is only effective when using the prio or leg memory pool strategies.
|
||||
|
||||
## TODO
|
||||
- Support more models and data types.
|
||||
### GGML_CANN_WEIGHT_NZ
|
||||
|
||||
Converting the matmul weight format from ND to NZ can significantly improve performance on the 310I DUO NPU.
|
||||
|
||||
|
Reference in New Issue
Block a user