llama.cpp/ggml
uvos 10f2e81809
CUDA/HIP: refractor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177)
refactor mmqv to unify the calculation of nwarps and rows per block between host and device code.

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-03-11 20:16:03 +01:00
..
cmake cmake: Fix ggml backend dependencies and installation (#11818) 2025-02-27 09:42:48 +02:00
include ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (#12154) 2025-03-06 02:26:10 +01:00
src CUDA/HIP: refractor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177) 2025-03-11 20:16:03 +01:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt opencl: use OpenCL C standard supported by the device (#12221) 2025-03-10 09:57:00 -07:00