llama.cpp

Atharva Dubey 663445b0de sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)	2025-06-02 10:12:20 +01:00
* [WIP]: fuse q8 quantization and reorder
* wip2: fuse q8 quantization and reorder
* working q8 reorder commit
* restored common.hpp
* remove debug prints
* remove unnecessary headers and remove trailing whitespace
* Update ggml/src/ggml-sycl/ggml-sycl.cpp
Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>
ggml-blas	cmake : Fix broken CMake error messages (ggml/1252)	2025-06-01 13:43:57 +03:00
ggml-cann	CANN: Add SOC TYPE printing in cmake configuration (#13837)	2025-05-28 11:54:20 +08:00
ggml-cpu	threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)	2025-05-31 15:39:19 -07:00
ggml-cuda	CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (#13895)	2025-05-31 08:48:04 +02:00
ggml-hip	CUDA/HIP: Share the same unified memory allocation logic. (#12934)	2025-04-15 11:20:38 +02:00
ggml-kompute	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-metal	ggml : add ggml_gelu_erf() (#13667)	2025-05-21 16:26:33 +02:00
ggml-musa	musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647)	2025-05-21 09:58:49 +08:00
ggml-opencl	opencl: add new ops - `argsort`, `div`, `sub`, `addrows`, `sigmoid`, `group_norm` (#13787)	2025-05-27 12:56:08 -07:00
ggml-rpc	rpc : add rpc_msg_set_tensor_hash_req (#13353)	2025-05-09 10:31:07 +03:00
ggml-sycl	sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)	2025-06-02 10:12:20 +01:00
ggml-vulkan	vulkan : Remove unexpected ; (ggml/1253)	2025-06-01 13:43:57 +03:00
CMakeLists.txt	ggml : install dynamic backends (ggml/1240)	2025-06-01 13:43:57 +03:00
ggml-alloc.c	ggml: Don't assert fail when tensor data changes (#13222)	2025-05-01 22:46:10 +02:00
ggml-backend-impl.h	ggml : upgrade init_tensor API to return a ggml_status (#11854)	2025-02-28 14:41:47 +01:00
ggml-backend-reg.cpp	ggml-backend : fix backend search path (#12330)	2025-03-11 14:25:17 +01:00
ggml-backend.cpp	sched : avoid changing cur_copy when a graph is already allocated (#13922)	2025-05-30 18:56:19 +02:00
ggml-common.h	musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci and update doc (#12611)	2025-03-30 10:59:38 +02:00
ggml-impl.h	ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)	2025-06-01 13:43:57 +03:00
ggml-opt.cpp	mnist: fix segmentation fault (ggml/1227)	2025-05-19 13:29:56 +03:00
ggml-quants.c	whisper: remove MSVC warnings pragmas (whisper/3090)	2025-05-07 17:28:36 +03:00
ggml-quants.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.h	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
ggml.c	sync : whisper.cpp (ggml/1250)	2025-06-01 13:43:57 +03:00
ggml.cpp	ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)	2025-06-01 13:43:57 +03:00
gguf.cpp	gguf: fix failure on version == 0 (#13956)	2025-06-01 18:08:05 +02:00