llama.cpp/ggml/src/ggml-cann
hipudding 7a395f67a7
CANN: Add support for async operator submission (#12864)
Submit operators using asynchronous threads to improve performance.

Use the environment variable GGML_CANN_ASYNC_MODE to control whether
asynchronous submission is enabled. It is disabled by default.

Testing shows a 10%–20% performance improvement in scenarios with
small parameter sizes, especially in quantized models.
2025-04-17 20:34:16 +08:00
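
The commit message above describes opt-in asynchronous operator submission controlled by GGML_CANN_ASYNC_MODE. The following is a minimal, generic sketch of that pattern in standard C++, not the code from this commit: the `Submitter` class, the `async_mode_enabled` and `run_demo` helpers, and the set of values treated as "enabled" are all hypothetical and chosen only for illustration. It assumes a single worker thread that executes queued operations in order, with a `wait()` call acting as the synchronization point.

```cpp
#include <condition_variable>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: async submission is off unless the variable is set.
// The accepted "truthy" strings are an assumption, not taken from the commit.
static bool async_mode_enabled() {
    const char * v = std::getenv("GGML_CANN_ASYNC_MODE");
    if (v == nullptr) return false;  // disabled by default
    const std::string s(v);
    return s == "1" || s == "on" || s == "true" || s == "yes";
}

// Hypothetical submitter: runs operations inline, or hands them to one
// worker thread that executes them in submission order.
class Submitter {
public:
    explicit Submitter(bool async) : async_(async) {
        if (async_) worker_ = std::thread([this] { run(); });
    }
    ~Submitter() {
        if (!async_) return;
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        worker_.join();  // the worker drains the queue before exiting
    }
    void submit(std::function<void()> op) {
        if (!async_) { op(); return; }  // synchronous path
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(op)); }
        cv_.notify_all();
    }
    // Blocks until every queued operation has finished.
    void wait() {
        if (!async_) return;
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return q_.empty() && !busy_; });
    }
private:
    void run() {
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
            if (q_.empty()) return;  // stop requested and nothing left to do
            std::function<void()> op = std::move(q_.front());
            q_.pop();
            busy_ = true;
            lk.unlock();
            op();            // execute outside the lock
            lk.lock();
            busy_ = false;
            cv_.notify_all();  // wake any thread blocked in wait()
        }
    }
    bool async_;
    bool stop_ = false;
    bool busy_ = false;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    std::thread worker_;
};

// Submits three small "operators" and returns their accumulated result.
static int run_demo(bool async) {
    int sum = 0;
    Submitter s(async);
    for (int i = 1; i <= 3; ++i) s.submit([&sum, i] { sum += i; });
    s.wait();  // results are only safe to read after synchronization
    return sum;
}
```

In this sketch a caller would write `run_demo(async_mode_enabled())`, so the behavior stays synchronous unless the variable is set. The design point it illustrates is that the submitting thread returns as soon as an operation is queued, which tends to matter most when each operation is small and the fixed per-submission cost is a large share of the total time.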
acl_tensor.cpp CANN: Support more ops (#12841) 2025-04-10 08:51:52 +08:00
acl_tensor.h CANN: Fix failed test cases (#12708) 2025-04-03 08:49:51 +08:00
aclnn_ops.cpp CANN: Add support for async operator submission (#12864) 2025-04-17 20:34:16 +08:00
aclnn_ops.h CANN: Add support for async operator submission (#12864) 2025-04-17 20:34:16 +08:00
CMakeLists.txt [CANN] get_rows and dup optimization (#12671) 2025-04-02 15:22:13 +08:00
common.h CANN: Add support for async operator submission (#12864) 2025-04-17 20:34:16 +08:00
Doxyfile cann : fix doxy (ggml/0) 2024-09-08 11:05:55 +03:00
ggml-cann.cpp CANN: Add support for async operator submission (#12864) 2025-04-17 20:34:16 +08:00