llama.cpp/ggml/src
Atharva Dubey 663445b0de
sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)
* [WIP]: fuse q8 quantization and reorder

* wip2: fuse q8 quantization and reorder

* working q8 reorder commit

* restored common.hpp

* remove debug prints

* remove unnecessary headers and remove trailing whitespace

* Update ggml/src/ggml-sycl/ggml-sycl.cpp

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>

2025-06-02 10:12:20 +01:00
ggml-blas cmake : Fix broken CMake error messages (ggml/1252) 2025-06-01 13:43:57 +03:00
ggml-cann CANN: Add SOC TYPE printing in cmake configuration (#13837) 2025-05-28 11:54:20 +08:00
ggml-cpu threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995) 2025-05-31 15:39:19 -07:00
ggml-cuda CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (#13895) 2025-05-31 08:48:04 +02:00
ggml-hip CUDA/HIP: Share the same unified memory allocation logic. (#12934) 2025-04-15 11:20:38 +02:00
ggml-kompute llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-metal ggml : add ggml_gelu_erf() (#13667) 2025-05-21 16:26:33 +02:00
ggml-musa musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647) 2025-05-21 09:58:49 +08:00
ggml-opencl opencl: add new ops - argsort, div, sub, addrows, sigmoid, group_norm (#13787) 2025-05-27 12:56:08 -07:00
ggml-rpc rpc : add rpc_msg_set_tensor_hash_req (#13353) 2025-05-09 10:31:07 +03:00
ggml-sycl sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826) 2025-06-02 10:12:20 +01:00
ggml-vulkan vulkan : Remove unexpected ; (ggml/1253) 2025-06-01 13:43:57 +03:00
CMakeLists.txt ggml : install dynamic backends (ggml/1240) 2025-06-01 13:43:57 +03:00
ggml-alloc.c ggml: Don't assert fail when tensor data changes (#13222) 2025-05-01 22:46:10 +02:00
ggml-backend-impl.h ggml : upgrade init_tensor API to return a ggml_status (#11854) 2025-02-28 14:41:47 +01:00
ggml-backend-reg.cpp ggml-backend : fix backend search path (#12330) 2025-03-11 14:25:17 +01:00
ggml-backend.cpp sched : avoid changing cur_copy when a graph is already allocated (#13922) 2025-05-30 18:56:19 +02:00
ggml-common.h musa: fix all warnings, re-enable -DLLAMA_FATAL_WARNINGS=ON in ci and update doc (#12611) 2025-03-30 10:59:38 +02:00
ggml-impl.h ggml : Print backtrace on uncaught C++ exceptions (ggml/1232) 2025-06-01 13:43:57 +03:00
ggml-opt.cpp mnist: fix segmentation fault (ggml/1227) 2025-05-19 13:29:56 +03:00
ggml-quants.c whisper: remove MSVC warnings pragmas (whisper/3090) 2025-05-07 17:28:36 +03:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c sync : whisper.cpp (ggml/1250) 2025-06-01 13:43:57 +03:00
ggml.cpp ggml : Print backtrace on uncaught C++ exceptions (ggml/1232) 2025-06-01 13:43:57 +03:00
gguf.cpp gguf: fix failure on version == 0 (#13956) 2025-06-01 18:08:05 +02:00