llama.cpp/ggml
Atharva Dubey 663445b0de
sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)
* [WIP]: fuse q8 quantization and reorder

* wip2: fuse q8 quantization and reorder

* working q8 reorder commit

* restored common.hpp

* remove debug prints

* remove unnecessary headers and remove trailing whitespace

* Update ggml/src/ggml-sycl/ggml-sycl.cpp

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>

2025-06-02 10:12:20 +01:00
Name            Last commit                                                                     Date
cmake           cmake: Factor out CPU architecture detection (#13883)                           2025-05-29 12:50:25 +02:00
include         ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247)  2025-06-01 13:43:57 +03:00
src             sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)   2025-06-02 10:12:20 +01:00
.gitignore      vulkan : cmake integration (#8119)                                              2024-07-13 18:12:39 +02:00
CMakeLists.txt  vulkan: use timestamp queries for GGML_VULKAN_PERF (#13817)                     2025-05-27 18:39:07 +02:00