llama.cpp

hipudding 7a395f67a7 CANN: Add support for async operator submission (#12864)	2025-04-17 20:34:16 +08:00

Submit operators using asynchronous threads to improve performance. Use the environment variable GGML_CANN_ASYNC_MODE to control whether asynchronous submission is enabled. It is disabled by default. Testing shows a 10%–20% performance improvement in scenarios with small parameter sizes, especially in quantized models.

cmake	scripts : update sync + fix cmake merge	2025-03-27 10:09:29 +02:00
include	ggml : add bilinear upscale support (ggml/1185)	2025-04-11 00:17:47 +03:00
src	CANN: Add support for async operator submission (#12864)	2025-04-17 20:34:16 +08:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	CUDA/HIP: Share the same unified memory allocation logic. (#12934)	2025-04-15 11:20:38 +02:00