llama.cpp

lhez 34a846b584 opencl: fix for small models (#11950)	2025-02-24 14:47:07 -07:00

* opencl: fix small shape gemv, remove unused extensions
* opencl: fix `transpose_16`, `dump_tensor`, enforce subgroup size
* opencl: fix for token length < 4
* opencl: use wave size of 64 for all Adreno GPUs

Co-authored-by: Shawn Gu <quic_shawngu@quicinc.com>
Co-authored-by: Skyler Szot <quic_sszot@quicinc.com>

embed_kernel.py	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-opencl.cl	opencl: fix for small models (#11950)	2025-02-24 14:47:07 -07:00
ggml-opencl_cvt.cl	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-opencl_gemv_noshuffle.cl	opencl: fix for small models (#11950)	2025-02-24 14:47:07 -07:00
ggml-opencl_gemv_noshuffle_general.cl	opencl: fix for small models (#11950)	2025-02-24 14:47:07 -07:00
ggml-opencl_mm.cl	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-opencl_mul_mat_Ab_Bi_8x4.cl	opencl: fix for small models (#11950)	2025-02-24 14:47:07 -07:00
ggml-opencl_transpose_16.cl	opencl: fix for small models (#11950)	2025-02-24 14:47:07 -07:00
ggml-opencl_transpose_32.cl	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-opencl_transpose_32_16.cl	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00