CANN: Add support for async operator submission (#12864)

Submit operators using asynchronous threads to improve performance.

Use the environment variable GGML_CANN_ASYNC_MODE to control whether
asynchronous submission is enabled. It is disabled by default.

Testing shows a 10%–20% performance improvement in scenarios with
small parameter sizes, especially in quantized models.
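
For illustration only, here is a minimal sketch of the general idea:
operators are pushed onto a queue and run by a worker thread, so the
caller does not block on each submission. It is not the code from this
commit. The names op_submitter and async_mode_enabled, and the set of
values treated as "enabled", are assumptions made for the example; only
the variable name GGML_CANN_ASYNC_MODE comes from the message above.

```cpp
#include <algorithm>
#include <cctype>
#include <condition_variable>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: the accepted values are an assumption, not taken
// from the commit. An unset variable means disabled, matching the default.
static bool async_mode_enabled() {
    const char * env = std::getenv("GGML_CANN_ASYNC_MODE");
    if (env == nullptr) return false;
    std::string v(env);
    std::transform(v.begin(), v.end(), v.begin(),
                   [](unsigned char c) { return (char) std::tolower(c); });
    return v == "1" || v == "on" || v == "yes" || v == "true";
}

// Hypothetical submitter: runs operators inline when async is off,
// otherwise hands them to a single worker thread in FIFO order.
class op_submitter {
public:
    explicit op_submitter(bool async) : async_(async) {
        if (async_) worker_ = std::thread([this] { run(); });
    }
    ~op_submitter() {
        if (!async_) return;
        { std::lock_guard<std::mutex> lock(m_); stop_ = true; }
        cv_.notify_all();
        worker_.join();  // the worker drains the queue before exiting
    }
    void submit(std::function<void()> op) {
        if (!async_) { op(); return; }
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(op)); }
        cv_.notify_all();
    }
    // Block until every submitted operator has finished.
    void wait() {
        if (!async_) return;
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return q_.empty() && !busy_; });
    }
private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !q_.empty(); });
            if (q_.empty()) return;  // stop requested and nothing left
            std::function<void()> op = std::move(q_.front());
            q_.pop();
            busy_ = true;
            lock.unlock();
            op();                    // run the operator outside the lock
            lock.lock();
            busy_ = false;
            cv_.notify_all();        // wake any caller blocked in wait()
        }
    }
    bool async_;
    bool stop_ = false;
    bool busy_ = false;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    std::thread worker_;
};
```

A caller would construct op_submitter with async_mode_enabled(), call
submit() once per operator, and call wait() before reading results. A
queue like this mainly hides per-operator launch overhead, which is one
plausible reason the gain would be larger when individual operators are
small.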
hipudding 2025-04-17 20:34:16 +08:00 committed by GitHub
parent 971f245b3b
commit 7a395f67a7
4 changed files with 604 additions and 356 deletions
