CANN: Add the basic supports of Flash Attention kernel (#13627)

* cann: add the basic FA support
* cann: update the readme
* cann: update the FlashAttention with PSEShift
* cann: update the input parameters in FA
* cann: update the alibi with max_bias
* cann: add the constraints of softcap
* cann: update the docs CANN.md
* cann: update the docs CANN.md
* cann: fix typo of CANN.md
* cann: add some comments and update the CANN.md
* cann: update the CANN.md
* cann: update the inner precise for fusedInferAttention
* cann: update the constraints of flash_attn_ext on ggml-cann.cpp
* cann: clean the whitespace
* cann: clean the whitespace
* cann: add a new endline
2025-05-26 10:20:18 +08:00
commit 2d38b6e400
parent e121edc432
9 changed files with 392 additions and 0 deletions
--- a/ggml/src/ggml-cann/acl_tensor.cpp
+++ b/ggml/src/ggml-cann/acl_tensor.cpp
@@ -31,6 +31,8 @@ aclDataType ggml_cann_type_mapping(ggml_type type) {
            return ACL_FLOAT;
        case GGML_TYPE_F16:
            return ACL_FLOAT16;
+        case GGML_TYPE_BF16:
+            return ACL_BF16;
        case GGML_TYPE_I8:
            return ACL_INT8;
        case GGML_TYPE_I16: