llama.cpp

Atharva Dubey 663445b0de sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)	2025-06-02 10:12:20 +01:00

* [WIP]: fuse q8 quantization and reorder
* wip2: fuse q8 quantization and reorder
* working q8 reorder commit
* restored common.hpp
* remove debug prints
* remove unnecessary headers and remove trailing whitespace
* Update ggml/src/ggml-sycl/ggml-sycl.cpp

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>

cmake	cmake: Factor out CPU architecture detection (#13883)	2025-05-29 12:50:25 +02:00
include	ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247)	2025-06-01 13:43:57 +03:00
src	sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)	2025-06-02 10:12:20 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	vulkan: use timestamp queries for GGML_VULKAN_PERF (#13817)	2025-05-27 18:39:07 +02:00