llama.cpp

Ewan Crawford 6b56a64690 SYCL: Avoid using with SYCL-Graph for unsupported nodes (#13587)	2025-05-22 16:24:09 +08:00

Currently on a CUDA backend to SYCL when running `GGML_SYCL_DISABLE_GRAPH=0 ./bin/test-backend-ops -b SYCL0` there are two operations that throw an exception from the blocking waits during queue recording.

* `-o CONCAT`: Use of blocking waits on a queue that's being recorded https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/concat.cpp#L185-L187
* `-o MUL_MAT_ID`: Blocking wait on a recording queue for a copy to host memory https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/ggml-sycl.cpp#L3072-L3074

We've noticed that `ggml-cuda.cu` has the [check_node_graph_compatibility_and_refresh_copy_ops](`39e73ae0d6/ggml/src/ggml-cuda/ggml-cuda.cu (L2458-L2458)`) method for checking if a graph can be used, even if enabled. I've taken a similar approach in this PR by adding a method to `ggml-sycl.cpp` for checking if a graph can be used for the operations even if a user has asked for it to be enabled.
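
The compatibility check described in the commit message can be illustrated with a minimal, self-contained sketch. It does not reproduce the code in `ggml-sycl.cpp`: the `Op` enum, the `Node` struct, and the `can_use_graph` function are hypothetical stand-ins invented for illustration, and the only facts taken from the commit message are that `CONCAT` and `MUL_MAT_ID` perform blocking waits that are not allowed while a queue is being recorded.

```cpp
#include <vector>

// Hypothetical stand-ins for ggml's operation and tensor types (not the real API).
enum class Op { Add, MulMat, Concat, MulMatId };

struct Node {
    Op op;
};

// Returns true only when the user requested graph execution AND no node in the
// compute graph needs a blocking wait during queue recording.
bool can_use_graph(const std::vector<Node>& nodes, bool graph_requested) {
    if (!graph_requested) {
        return false;  // graphs disabled by the user, nothing to check
    }
    for (const Node& node : nodes) {
        switch (node.op) {
            case Op::Concat:    // blocking wait on a queue that is being recorded
            case Op::MulMatId:  // blocking wait for a copy to host memory
                return false;   // fall back to eager submission for the whole graph
            default:
                break;
        }
    }
    return true;
}
```

Under these assumptions, a caller would evaluate `can_use_graph` once per compute graph and record a graph only when it returns true, so a user's request to enable graphs is overridden whenever an unsupported node is present.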
cmake	scripts : update sync + fix cmake merge	2025-03-27 10:09:29 +02:00
include	ggml : add ggml_gelu_erf() (#13667)	2025-05-21 16:26:33 +02:00
src	SYCL: Avoid using with SYCL-Graph for unsupported nodes (#13587)	2025-05-22 16:24:09 +08:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	sycl: use oneDNN for matrices multiplication (#12972)	2025-05-15 16:53:41 +02:00