R0CKSTAR
|
c35e586ea5
|
musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (#9526)
* mtgpu: add mp_21 support
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* mtgpu: disable flash attention on qy1 (MTT S80); disable q3_k and mul_mat_batched_cublas
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* mtgpu: enable unified memory
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* mtgpu: map cublasOperation_t to mublasOperation_t (sync code to latest)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
|
2024-09-22 16:55:49 +02:00 |
|