llama.cpp/ggml
Georgi Gerganov 2f74c354c0
graph : make FA compatible with MLA + add initial Metal kernels (#12953)
* graph : make mla compatible with FA

* metal : add exp FA kernels for DeepSeek models

ggml-ci

* llama : minor naming updates

ggml-ci

* ggml : disable FA for DS head sizes

* tests : add FA tests for MLA shapes

ggml-ci
2025-04-17 18:16:36 +03:00
cmake           scripts : update sync + fix cmake merge                                   2025-03-27 10:09:29 +02:00
include         ggml : add bilinear upscale support (ggml/1185)                           2025-04-11 00:17:47 +03:00
src             graph : make FA compatible with MLA + add initial Metal kernels (#12953)  2025-04-17 18:16:36 +03:00
.gitignore      vulkan : cmake integration (#8119)                                        2024-07-13 18:12:39 +02:00
CMakeLists.txt  CUDA/HIP: Share the same unified memory allocation logic. (#12934)        2025-04-15 11:20:38 +02:00