llama.cpp/ggml
Akarshan Biswas 6f180b915c
SYCL: Add non contiguous support in RMS_NORM and NORM kernels (#13611)
* SYCL: Add non contiguous input support to norm kernel

* refactor and add RMS_NORM non contiguous input support

ggml-ci

* restore subgroup reduction for multi-subgroup thread blocks in norm kernels

* Swap grid dims of nsamples and nrows

ggml-ci

* Revert "Swap grid dims of nsamples and nrows"

This reverts commit 43be2d657fec7f7fba54e2cd154106bc0fc45adf.

* restore not required changes
ggml-ci

* address review comments: change it to more like SYCL

* Use a common function to calculate offset

* remove wrap around logic for handling broadcasts

* remove static from calculate_offset fn and use ceil_div
2025-05-26 21:10:36 +05:30
cmake           scripts : update sync + fix cmake merge                                 2025-03-27 10:09:29 +02:00
include         ggml : fix the order of ggml_unary_op (#13718)                          2025-05-23 08:12:48 +02:00
src             SYCL: Add non contiguous support in RMS_NORM and NORM kernels (#13611)  2025-05-26 21:10:36 +05:30
.gitignore      vulkan : cmake integration (#8119)                                      2024-07-13 18:12:39 +02:00
CMakeLists.txt  sycl: use oneDNN for matrices multiplication (#12972)                   2025-05-15 16:53:41 +02:00