CANN: Add the basic supports of Flash Attention kernel (#13627)
* cann: add the basic FA support
* cann: update the readme
* cann: update the FlashAttention with PSEShift
* cann: update the input parameters in FA
* cann: update the alibi with max_bias
* cann: add the constrints of softcap
* cann: update the docs CANN.md
* cann: update the docs CANN.md
* cann: fix typo of CANN.md
* cann: add some comments and update the CANN.md
* cann: update the CANN.md
* cann: update the inner precise for fusedInferAttention
* cann: update the constraints of flash_attn_ext on ggml-cann.cpp
* cann: clean the whitespace
* cann: clean the whitespace
* cann: add a new endline
parent e121edc432
commit 2d38b6e400
9 changed files with 392 additions and 0 deletions
ggml/src/ggml-cann/acl_tensor.cpp (2 changes; Normal file → Executable file)
@@ -31,6 +31,8 @@ aclDataType ggml_cann_type_mapping(ggml_type type) {
             return ACL_FLOAT;
         case GGML_TYPE_F16:
             return ACL_FLOAT16;
+        case GGML_TYPE_BF16:
+            return ACL_BF16;
         case GGML_TYPE_I8:
             return ACL_INT8;
         case GGML_TYPE_I16: