llama.cpp/docs/backend
Bizhao Shi 2d38b6e400
CANN: Add basic support for the Flash Attention kernel (#13627)
* cann: add the basic FA support
* cann: update the readme
* cann: update FlashAttention with PSE shift
* cann: update the input parameters in FA
* cann: update ALiBi with max_bias
* cann: add the constraints for softcap
* cann: update the docs CANN.md
* cann: update the docs CANN.md
* cann: fix a typo in CANN.md
* cann: add some comments and update CANN.md
* cann: update CANN.md
* cann: update the inner precise mode for fusedInferAttention
* cann: update the constraints of flash_attn_ext in ggml-cann.cpp
* cann: clean up whitespace
* cann: clean up whitespace
* cann: add a trailing newline
2025-05-26 10:20:18 +08:00
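For context, the items in the commit message above (the PSE shift, ALiBi via max_bias, and the softcap constraint) correspond to parameters of ggml's generic fused-attention op, `ggml_flash_attn_ext`, which a backend such as CANN then implements. The sketch below shows how such a node is constructed at the ggml level; the tensor shapes and parameter values are illustrative assumptions, not taken from the PR.

```c
// Minimal sketch (illustrative, not from the commit): building a
// ggml_flash_attn_ext node whose parameters match the commit items above.
#include <math.h>
#include "ggml.h"

int main(void) {
    struct ggml_init_params ip = {
        /*.mem_size   =*/ 16*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(ip);

    // Illustrative dimensions (assumptions, not from the PR).
    const int64_t head_dim = 128, n_head = 8, n_kv = 256, n_batch = 1;

    // Q/K/V in the layout ggml_flash_attn_ext expects (V is not transposed).
    struct ggml_tensor * q = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, head_dim, n_batch, n_head, 1);
    struct ggml_tensor * k = ggml_new_tensor_4d(ctx, GGML_TYPE_F16, head_dim, n_kv,    n_head, 1);
    struct ggml_tensor * v = ggml_new_tensor_4d(ctx, GGML_TYPE_F16, head_dim, n_kv,    n_head, 1);

    // The additive mask is the "PSE shift" the commit refers to; its row count
    // is padded to GGML_KQ_MASK_PAD as required by the op.
    struct ggml_tensor * mask = ggml_new_tensor_2d(ctx, GGML_TYPE_F16,
            n_kv, GGML_PAD(n_batch, GGML_KQ_MASK_PAD));

    const float scale         = 1.0f/sqrtf((float) head_dim);
    const float max_bias      = 0.0f; // > 0.0f enables ALiBi slopes
    const float logit_softcap = 0.0f; // > 0.0f enables soft-capping (constrained on CANN)

    struct ggml_tensor * out = ggml_flash_attn_ext(ctx, q, k, v, mask,
            scale, max_bias, logit_softcap);

    // In llama.cpp the node goes into a compute graph; the backend scheduler
    // dispatches it to a device kernel when the backend reports support for it.
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, out);

    ggml_free(ctx);
    return 0;
}
```

Whether this node actually runs fused on a given device depends on that backend's op-support checks; the last items in the commit message adjust exactly those constraints for CANN in ggml-cann.cpp.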
BLIS.md make : deprecate (#10514) 2024-12-02 21:22:53 +02:00
CANN.md CANN: Add basic support for the Flash Attention kernel (#13627) 2025-05-26 10:20:18 +08:00
CUDA-FEDORA.md docs: update: improve the Fedora CUDA guide (#12536) 2025-03-24 11:02:26 +00:00
OPENCL.md opencl: update doc for OpenCL (#12702) 2025-04-03 22:18:17 -07:00
SYCL.md sycl : backend documentation review (#13544) 2025-05-19 14:38:20 +01:00