llama.cpp/ggml/src/ggml-vulkan
Daniele cf2270e4d3
vulkan: subgroup size tuning (#12087)
* vulkan: subgroup size test

* Vulkan: Add device architecture enum and logic to recognize AMD generations

* vulkan: use new architecture logic to specify subgroup size

* Initial vulkan subgroup size tuning for RDNA3

* vulkan: commonize RDNA subgroup tuning

* vulkan: override subgroup size if required_subgroup_size = 0

* vulkan: disable warp 32 for RDNA3

* vulkan: fine tuned RDNA1 subgroup sizes

* vulkan: adjusted subgroup size map

* vulkan: fixed RDNA2 subgroup map

---------

Co-authored-by: 0cc4m <picard12@live.de>
2025-03-17 12:42:33 +01:00
..
cmake fix: ggml: fix vulkan-shaders-gen build (#10448) 2025-01-15 14:17:42 +01:00
vulkan-shaders vulkan: use fp32 in coopmat2 q4_k dequant function (#12309) 2025-03-17 10:43:35 +01:00
CMakeLists.txt fix: ggml: fix vulkan-shaders-gen build (#10448) 2025-01-15 14:17:42 +01:00
ggml-vulkan.cpp vulkan: subgroup size tuning (#12087) 2025-03-17 12:42:33 +01:00