llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders
Jeff Bolz b3e585988f
vulkan: Optimize soft_max (#10301)
* vulkan: Optimize soft_max

Large soft_max could already saturate memory, but small/medium sizes were
pretty slow. The bulk of the gains for them comes from using a smaller
workgroup size, and making the workgroup size match the subgroup size also
makes the barriers much cheaper.

Cache some values in locals to avoid refetching/recomputing. And stamp
out a few "template instantiations" so smaller cases will fully unroll.

Add a missing early return for OOB rows. This happens when there are more
than 512 rows and the dispatch is 512 x H.

* vulkan: Further soft_max optimizations

Restore the workgroup size of 512 case, use it for >1024.

Use unrollable loops for more iteration counts.
2024-11-19 08:25:17 +01:00
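
The out-of-bounds case described in the commit message can be illustrated with a small host-side model. This is only a sketch, not the shader's code: it assumes a workgroup's row index is linearized as y * 512 + x over a 512 x H grid, and the names kGridX, grid_height and simulate_dispatch are hypothetical. With more than 512 rows, H is rounded up, so the last grid row can contain workgroups whose index is past the final row; those must return before touching memory.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a "512 x H" dispatch; this is not the actual shader code.
constexpr std::uint32_t kGridX = 512;  // X extent of the dispatch, per the commit message

// Number of grid rows (the "H") needed to cover nrows matrix rows, rounded up.
std::uint32_t grid_height(std::uint32_t nrows) {
    return (nrows + kGridX - 1) / kGridX;
}

// Visits every workgroup in the grid and counts how often each matrix row is
// processed. Returns the number of workgroups that did real work.
std::uint32_t simulate_dispatch(std::uint32_t nrows, std::vector<int>& touched) {
    touched.assign(nrows, 0);
    std::uint32_t active = 0;
    const std::uint32_t h = grid_height(nrows);
    for (std::uint32_t y = 0; y < h; ++y) {
        for (std::uint32_t x = 0; x < kGridX; ++x) {
            const std::uint32_t row = y * kGridX + x;  // assumed linearization
            if (row >= nrows) {
                continue;  // stands in for the shader's early return on OOB rows
            }
            ++touched[row];
            ++active;
        }
    }
    return active;
}
```

For example, 600 rows need a 512 x 2 grid, which launches 1024 workgroups; under the assumptions above, only the first 600 map to a valid row and the remaining 424 exit immediately. Without the bounds check, `touched[row]` would be indexed out of range, which is the host-side analogue of the missing early return.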
acc.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
add.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
argsort.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
clamp.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
CMakeLists.txt ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
concat.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
contig_copy.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
copy.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
cos.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_f32.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_funcs.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_iq4_nl.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q2_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q3_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q6_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q8_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
diag_mask_inf.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
div.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
gelu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
gelu_quick.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_binary_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_unary_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
get_rows.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
get_rows_quant.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
group_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
im2col.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
leaky_relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_split_k_reduce.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec.comp vulkan: remove use of null initializer (#10372) 2024-11-18 08:28:42 -06:00
mul_mat_vec_base.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_nc.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_p021.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_q2_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_q3_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_q4_k.comp vulkan: Optimize some mat-vec mul quant shaders (#10296) 2024-11-16 07:26:57 +01:00
mul_mat_vec_q5_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_q6_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
pad.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
pool2d.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
repeat.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rms_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_neox.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
scale.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
silu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
sin.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
soft_max.comp vulkan: Optimize soft_max (#10301) 2024-11-19 08:25:17 +01:00
square.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
sum_rows.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
tanh.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
timestep_embedding.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
types.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
upscale.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
vulkan-shaders-gen.cpp vulkan: Optimize some mat-vec mul quant shaders (#10296) 2024-11-16 07:26:57 +01:00