llama.cpp/ggml
commit 36ca8b3628
Author: Sigbjørn Skjæret
Date:   2025-04-07 18:44:17 +03:00

    CUDA: don't convert BF16 weights to FP32 (ggml/1174)

    * add bf16 support
    * use convert_from_bf16_cuda instead of convert_unary_cuda for f32
    * revert 7ec5085
    * move functionality into convert_unary with constexpr
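For context, the last bullet describes compile-time dispatch inside a single conversion kernel, so BF16 inputs no longer take a separate FP32 round trip. Below is a minimal sketch of that pattern, assuming CUDA's cuda_bf16.h types; the kernel and launcher names mirror the commit message, but the bodies are illustrative, not the actual ggml implementation:

```cuda
// Sketch only (not the ggml source): one templated unary conversion kernel
// that branches on the source type at compile time with `if constexpr`,
// so BF16 goes through __bfloat162float instead of a plain cast.
// Requires C++17 (nvcc -std=c++17).
#include <cuda_bf16.h>
#include <cstdint>
#include <type_traits>

template <typename src_t, typename dst_t>
static __global__ void convert_unary(const void * __restrict__ vx, dst_t * __restrict__ y, const int64_t k) {
    const int64_t i = (int64_t) blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= k) {
        return;
    }
    const src_t * x = (const src_t *) vx;
    if constexpr (std::is_same_v<src_t, nv_bfloat16>) {
        // BF16 needs an explicit conversion intrinsic on older toolchains.
        y[i] = (dst_t) __bfloat162float(x[i]);
    } else {
        y[i] = (dst_t) x[i];
    }
}

template <typename src_t, typename dst_t>
static void convert_unary_cuda(const void * vx, dst_t * y, const int64_t k, cudaStream_t stream) {
    const int block_size = 256;
    const int num_blocks = (int) ((k + block_size - 1) / block_size);
    convert_unary<src_t><<<num_blocks, block_size, 0, stream>>>(vx, y, k);
}

// Usage (hypothetical): convert a BF16 tensor to F32 entirely on-device:
//   convert_unary_cuda<nv_bfloat16, float>(src_bf16, dst_f32, n, stream);
```

Folding the BF16 case into the generic kernel keeps one code path for all unary conversions; the `if constexpr` branch is resolved at compile time, so the non-BF16 instantiations carry no runtime cost.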
Name            Last commit                                                       Date
cmake           scripts : update sync + fix cmake merge                          2025-03-27 10:09:29 +02:00
include         metal : improve FA + improve MoE (#12612)                        2025-03-28 20:21:59 +02:00
src             CUDA: don't convert BF16 weights to FP32 (ggml/1174)             2025-04-07 18:44:17 +03:00
.gitignore      vulkan : cmake integration (#8119)                               2024-07-13 18:12:39 +02:00
CMakeLists.txt  ggml : add logging for native build options/vars (whisper/2935)  2025-03-30 08:33:31 +03:00