sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)

* sycl : Implemented reorder Q4_0 mmvq

Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>

* sycl : Fixed mmvq being called when reorder is disabled

* sycl : Improved comments in the quants header

Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>

* Use static_assert

* safe_div -> ceil_div

* Clarify qi comment

* change the reorder tensor from init to execute OP

* dbg

* Undo changes to test-backend-ops

* Refactor changes on top of q4_0 reorder fix

* Missing Reverts

* Refactored opt_for_reorder logic to simplify code path

* Explicit inlining and unroll

* Renamed mul_mat_algo enum for consistency

---------

Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>
Co-authored-by: romain.biessy <romain.biessy@codeplay.com>
This commit is contained in:
Alberto Cabrera Pérez 2025-05-09 16:34:08 +01:00 committed by GitHub
parent 611aa914ef
commit 17512a94d6
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
6 changed files with 385 additions and 136 deletions

View file

@ -14,23 +14,24 @@
#define GGML_SYCL_BACKEND_HPP
#include "binbcast.hpp"
#include "concat.hpp"
#include "common.hpp"
#include "concat.hpp"
#include "conv.hpp"
#include "convert.hpp"
#include "cpy.hpp"
#include "dequantize.hpp"
#include "dmmv.hpp"
#include "element_wise.hpp"
#include "gla.hpp"
#include "im2col.hpp"
#include "mmq.hpp"
#include "mmvq.hpp"
#include "rope.hpp"
#include "norm.hpp"
#include "outprod.hpp"
#include "quants.hpp"
#include "rope.hpp"
#include "softmax.hpp"
#include "tsembd.hpp"
#include "im2col.hpp"
#include "wkv.hpp"
#include "outprod.hpp"
#include "element_wise.hpp"
#include "cpy.hpp"
#include "gla.hpp"
#endif // GGML_SYCL_BACKEND_HPP
#endif // GGML_SYCL_BACKEND_HPP