llama.cpp

uvos 10f2e81809 CUDA/HIP: refactor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177) Co-authored-by: Johannes Gäßler <johannesg@5d6.de>	2025-03-11 20:16:03 +01:00
cmake	cmake: Fix ggml backend dependencies and installation (#11818)	2025-02-27 09:42:48 +02:00
include	ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (#12154)	2025-03-06 02:26:10 +01:00
src	CUDA/HIP: refactor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177)	2025-03-11 20:16:03 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	opencl: use OpenCL C standard supported by the device (#12221)	2025-03-10 09:57:00 -07:00