Submit operators using asynchronous threads to improve performance. Use the environment variable `GGML_CANN_ASYNC_MODE` to control whether asynchronous submission is enabled. It is disabled by default. Testing shows a 10%–20% performance improvement in scenarios with small parameter sizes, especially in quantized models.

| Name |
|---|
| acl_tensor.cpp |
| acl_tensor.h |
| aclnn_ops.cpp |
| aclnn_ops.h |
| CMakeLists.txt |
| common.h |
| Doxyfile |
| ggml-cann.cpp |
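
The commit message above describes an opt-in mode in which operators are handed to a background thread instead of being launched inline. The sketch below illustrates that general pattern only; it is not the backend's actual code. `async_mode_enabled` and `OpSubmitter` are hypothetical names, and the set of values treated as "enabled" is an assumption (the commit message only says the feature is off by default).

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: reads GGML_CANN_ASYNC_MODE from the environment.
// Assumption: "1", "on", "true" and "yes" mean enabled; unset means disabled.
static bool async_mode_enabled() {
    const char * value = std::getenv("GGML_CANN_ASYNC_MODE");
    if (value == nullptr) {
        return false;  // disabled by default
    }
    const std::string s(value);
    return s == "1" || s == "on" || s == "true" || s == "yes";
}

// Hypothetical submitter: runs tasks inline when async is off, otherwise
// queues them for a single worker thread that executes them in FIFO order.
class OpSubmitter {
  public:
    explicit OpSubmitter(bool async) : async_(async) {
        if (async_) {
            worker_ = std::thread([this] { run(); });
        }
    }

    ~OpSubmitter() {
        if (!async_) {
            return;
        }
        {
            std::lock_guard<std::mutex> lock(mu_);
            stop_ = true;
        }
        cv_.notify_all();
        worker_.join();  // the worker drains the queue before exiting
    }

    void submit(std::function<void()> task) {
        assert(task);  // reject empty callables
        if (!async_) {
            task();  // synchronous path: run on the caller's thread
            return;
        }
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push(std::move(task));
        }
        cv_.notify_all();
    }

    // Blocks until every queued task has finished.
    void wait() {
        if (!async_) {
            return;
        }
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return tasks_.empty() && !busy_; });
    }

  private:
    void run() {
        std::unique_lock<std::mutex> lock(mu_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
            if (tasks_.empty()) {
                return;  // stop requested and nothing left to run
            }
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop();
            busy_ = true;
            lock.unlock();
            task();  // execute outside the lock so submit() is not blocked
            lock.lock();
            busy_ = false;
            cv_.notify_all();  // wake any thread blocked in wait()
        }
    }

    bool                              async_;
    std::mutex                        mu_;
    std::condition_variable           cv_;
    std::queue<std::function<void()>> tasks_;
    bool                              busy_ = false;
    bool                              stop_ = false;
    std::thread                       worker_;
};
```

A caller would construct `OpSubmitter submitter(async_mode_enabled());`, pass each operator launch to `submit`, and call `wait()` before reading results. A single worker keeps operators in submission order, which matters when later operators depend on earlier outputs.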