llama.cpp/ggml/src/ggml-cpu
Christian Kastner 532802f938
Implement GGML_CPU_ALL_VARIANTS for ARM (#14080)
* ggml-cpu: Factor out feature detection build from x86

* ggml-cpu: Add ARM feature detection and scoring

This is analogous to cpu-feats-x86.cpp. However, to detect compile-time
activation of features, we rely on GGML_USE_<FEAT>, which needs to be set
in CMake, instead of GGML_<FEAT>, which users would set for x86.

This is because on ARM, users specify features with GGML_CPU_ARM_ARCH,
rather than with individual flags.
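
A scoring function along these lines might look as follows on Linux/aarch64.
This is a minimal sketch, not the actual cpu-feats source: the function name is
hypothetical, the feature/HWCAP pairs are only an illustrative subset, and the
weights are arbitrary; it assumes only the documented getauxval()/HWCAP
interface and the GGML_USE_<FEAT> convention described above.

    // Sketch: score this CPU backend variant on Linux/aarch64.
    // Every feature the variant was compiled with (GGML_USE_<FEAT>) must be
    // present at runtime, otherwise the variant is unusable (score 0); each
    // matched feature raises the score so that richer variants win.
    #include <sys/auxv.h>   // getauxval, AT_HWCAP, AT_HWCAP2
    #include <asm/hwcap.h>  // HWCAP_* / HWCAP2_* bits (aarch64 Linux)

    static int arm_variant_score(void) {    // hypothetical name
        const unsigned long hwcap  = getauxval(AT_HWCAP);
        const unsigned long hwcap2 = getauxval(AT_HWCAP2);

        int score = 1;  // baseline: the variant is usable at all

    #ifdef GGML_USE_DOTPROD
        if (!(hwcap & HWCAP_ASIMDDP)) return 0;
        score += 1 << 1;
    #endif
    #ifdef GGML_USE_FP16_VECTOR_ARITHMETIC
        if (!(hwcap & HWCAP_ASIMDHP)) return 0;
        score += 1 << 2;
    #endif
    #ifdef GGML_USE_SVE
        if (!(hwcap & HWCAP_SVE)) return 0;
        score += 1 << 3;
    #endif
    #ifdef GGML_USE_MATMUL_INT8
        if (!(hwcap2 & HWCAP2_I8MM)) return 0;
        score += 1 << 4;
    #endif
        return score;
    }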

* ggml-cpu: Implement GGML_CPU_ALL_VARIANTS for ARM

This mirrors x86; however, to pass arch flags around within CMake, we use
GGML_INTERNAL_<FEAT>, as we don't have GGML_<FEAT> on ARM.

Some features are optional, so we may need to build multiple backends
per arch version (armv8.2_1, armv8.2_2, ...), and let the scoring
function sort out which one can be used.
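
Selecting among those variants then comes down to loading each candidate
library, asking it for its score, and keeping the best one. The sketch below is
a generic dlopen()-based illustration, not ggml's actual loader; the
"backend_score" symbol name and its int() signature are assumptions made for
the example.

    // Sketch: pick the highest-scoring usable variant among several shared
    // objects built for the same architecture (armv8.2_1, armv8.2_2, ...).
    #include <dlfcn.h>
    #include <string>
    #include <vector>

    using score_fn_t = int (*)();  // assumed signature of the exported score hook

    static void * pick_best_variant(const std::vector<std::string> & candidates) {
        void * best_handle = nullptr;
        int    best_score  = 0;

        for (const auto & path : candidates) {
            void * handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
            if (!handle) {
                continue;  // not loadable on this system
            }
            // "backend_score" is a placeholder symbol name for this sketch.
            auto score_fn = reinterpret_cast<score_fn_t>(dlsym(handle, "backend_score"));
            const int score = score_fn ? score_fn() : 0;

            if (score > best_score) {
                if (best_handle) dlclose(best_handle);
                best_handle = handle;
                best_score  = score;
            } else {
                dlclose(handle);  // unusable (score 0) or worse than current best
            }
        }
        return best_handle;  // nullptr if no variant is usable on this CPU
    }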

* ggml-cpu: Limit ARM GGML_CPU_ALL_VARIANTS to Linux for now

The other platforms will need their own specific variants.

This also fixes a bug where the variant-building branch was always
executed as the else-branch of GGML_NATIVE=OFF. The branch is now an
elseif-branch, which restores the previous behavior.
2025-06-11 21:07:44 +02:00
amx ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
arch Implement GGML_CPU_ALL_VARIANTS for ARM (#14080) 2025-06-11 21:07:44 +02:00
cmake ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
kleidiai ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
llamafile ggml : Enable MMA for BF16 in llamafile_sgemm (#13148) 2025-05-02 19:53:12 +03:00
binary-ops.cpp cpu: de-duplicate some of the operators and refactor (ggml/1144) 2025-03-30 08:33:31 +03:00
binary-ops.h cpu: de-duplicate some of the operators and refactor (ggml/1144) 2025-03-30 08:33:31 +03:00
CMakeLists.txt Implement GGML_CPU_ALL_VARIANTS for ARM (#14080) 2025-06-11 21:07:44 +02:00
common.h ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
ggml-cpu-impl.h ggml : fix weak alias win32 (whisper/0) 2025-06-10 18:39:33 +03:00
ggml-cpu.c ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
ggml-cpu.cpp ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
hbm.cpp ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
hbm.h ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
ops.cpp releases : use dl backend for linux release, remove arm64 linux release (#13996) 2025-06-04 13:15:54 +02:00
ops.h ggml : Depthwise 2D convolution (ggml/1152) 2025-04-24 17:32:47 +03:00
quants.c ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
quants.h ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
repack.cpp ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
repack.h ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
simd-mappings.h ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843) 2025-05-29 09:01:33 +03:00
traits.cpp ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
traits.h ggml-cpu : split arch-specific implementations (#13892) 2025-06-09 16:47:13 +02:00
unary-ops.cpp cpu: de-duplicate some of the operators and refactor (ggml/1144) 2025-03-30 08:33:31 +03:00
unary-ops.h cpu: de-duplicate some of the operators and refactor (ggml/1144) 2025-03-30 08:33:31 +03:00
vec.cpp ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843) 2025-05-29 09:01:33 +03:00
vec.h ggml: aarch64: Implement SVE F32 kernels for Mamba Sequential Scan Algorithm (#13882) 2025-05-29 12:18:43 +03:00