llama.cpp

History

Alberto Cabrera Pérez 17512a94d6 sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)	2025-05-09 16:34:08 +01:00
* sycl : Implemented reorder Q4_0 mmvq
  Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>
* sycl : Fixed mmvq being called when reorder is disabled
* sycl : Improved comments in the quants header
  Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>
* Use static_assert
* safe_div -> ceil_div
* Clarify qi comment
* change the reorder tensor from init to execute OP
* dbg
* Undo changes to test-backend-ops
* Refactor changes on top of q4_0 reorder fix
* Missing Reverts
* Refactored opt_for_reorder logic to simplify code path
* Explicit inlining and unroll
* Renamed mul_mat_algo enum for consistency
---------
Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>
Co-authored-by: romain.biessy <romain.biessy@codeplay.com>
cmake	scripts : update sync + fix cmake merge	2025-03-27 10:09:29 +02:00
include	CUDA: fix bad asserts for partial offload (#13337)	2025-05-06 13:58:51 +02:00
src	sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)	2025-05-09 16:34:08 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	whisper: remove MSVC warnings pragmas (whisper/3090)	2025-05-07 17:28:36 +03:00