CANN: Improve loading efficiency after converting weights to NZ format. (#14985)

* CANN: Improve loading efficiency after converting weights to NZ format.

* CANN: fix typo
hipudding
2025-07-31 19:47:20 +08:00
committed by GitHub
parent 66625a59a5
commit 11490b3672
3 changed files with 70 additions and 58 deletions

@@ -310,5 +310,7 @@ Specifies the memory pool management strategy:
Controls automatic cleanup of the memory pool. This option is only effective when using the prio or leg memory pool strategies.
## TODO
- Support more models and data types.
### GGML_CANN_WEIGHT_NZ
Converting the matmul weight format from ND to NZ can significantly improve performance on the 310I DUO NPU.
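
As a minimal sketch of how this option might be used: `GGML_CANN_WEIGHT_NZ` is the variable named above, but the value `1` is an assumption, since this hunk does not say which values the backend accepts or what the default is. Check the backend's option parsing before relying on it.

```bash
# Assumption: a truthy value such as 1 enables the ND -> NZ weight conversion.
# The exact accepted values are not stated in this hunk.
export GGML_CANN_WEIGHT_NZ=1

# Confirm the variable is set; any llama.cpp binary started from this shell
# inherits it.
echo "GGML_CANN_WEIGHT_NZ=${GGML_CANN_WEIGHT_NZ}"
```

The conversion applies to matmul weights, so any gain would show up in matmul-heavy workloads on the 310I DUO. Benchmark with and without the variable to confirm the effect on your setup.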