CANN: Add support for async operator submission (#12864)
Submit operators from asynchronous threads to improve performance. The environment variable GGML_CANN_ASYNC_MODE controls whether asynchronous submission is enabled; it is disabled by default. Testing shows a 10%–20% performance improvement in scenarios with small parameter sizes, especially with quantized models.
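The idea can be illustrated with a minimal sketch. This is not the code from this commit: `async_mode_enabled` and `op_submitter` are hypothetical names, the rule for parsing GGML_CANN_ASYNC_MODE (any non-empty value other than "0" enables it) is an assumption, and a plain `std::function` stands in for a real operator launch. The sketch shows a single worker thread that drains a FIFO queue, so the calling thread can keep preparing the next operator while earlier ones are being submitted.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Assumption: any non-empty value other than "0" turns async mode on.
// The actual backend may interpret GGML_CANN_ASYNC_MODE differently.
inline bool async_mode_enabled() {
    const char * v = std::getenv("GGML_CANN_ASYNC_MODE");
    return v != nullptr && *v != '\0' && std::string(v) != "0";
}

// Hypothetical submitter: runs operators inline when async is off,
// otherwise queues them for one worker thread that preserves order.
class op_submitter {
public:
    explicit op_submitter(bool async) : async_(async) {
        if (async_) worker_ = std::thread([this] { run(); });
    }

    ~op_submitter() {
        if (!async_) return;
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        worker_.join();  // the worker drains the queue before exiting
    }

    void submit(std::function<void()> op) {
        assert(op && "operator task must be callable");
        if (!async_) { op(); return; }  // synchronous path (the default)
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(op)); }
        cv_.notify_all();
    }

    // Block until every queued operator has finished.
    void wait() {
        if (!async_) return;
        std::unique_lock<std::mutex> lk(m_);
        idle_cv_.wait(lk, [this] { return q_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
            if (q_.empty()) return;  // stop requested and nothing left
            std::function<void()> op = std::move(q_.front());
            q_.pop();
            busy_ = true;
            lk.unlock();
            op();                    // run the operator outside the lock
            lk.lock();
            busy_ = false;
            idle_cv_.notify_all();
        }
    }

    bool async_;
    bool stop_ = false;
    bool busy_ = false;
    std::mutex m_;
    std::condition_variable cv_;
    std::condition_variable idle_cv_;
    std::queue<std::function<void()>> q_;
    std::thread worker_;
};
```

A caller would construct `op_submitter submitter(async_mode_enabled());`, call `submit` once per operator, and call `wait` before reading results. Using one worker and a FIFO queue keeps operators in submission order, which matters when later operators depend on earlier outputs.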
parent 971f245b3b
commit 7a395f67a7
4 changed files with 604 additions and 356 deletions