llama.cpp

yuiseki 5d5c066de8 mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326) Mistral Small 2506 models using Pixtral vision encoder were running out of GPU memory when processing images larger than 1024x1024 pixels due to exponential memory growth from unlimited image size. This fix applies the same 1024x1024 limit used by Qwen2VL models to prevent OOM issues while maintaining compatibility with existing models.	2025-06-22 14:44:57 +02:00
batched-bench	llama : deprecate llama_kv_self_ API (#14030)	2025-06-06 14:11:15 +03:00
cvector-generator	llama : deprecate llama_kv_self_ API (#14030)	2025-06-06 14:11:15 +03:00
export-lora	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
gguf-split	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
imatrix	llama : deprecate llama_kv_self_ API (#14030)	2025-06-06 14:11:15 +03:00
llama-bench	llama-bench : add --no-warmup flag (#14224) (#14270)	2025-06-19 12:24:12 +02:00
main	llama : deprecate llama_kv_self_ API (#14030)	2025-06-06 14:11:15 +03:00
mtmd	mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326)	2025-06-22 14:44:57 +02:00
perplexity	llama : deprecate llama_kv_self_ API (#14030)	2025-06-06 14:11:15 +03:00
quantize	quantize : improve tensor-type pattern matching (#13033)	2025-05-13 19:12:31 +02:00
rpc	rpc : Fix build on OpenBSD (#13541)	2025-05-25 15:35:53 +03:00
run	llama : deprecate llama_kv_self_ API (#14030)	2025-06-06 14:11:15 +03:00
server	llama : improve sep token handling (#14272)	2025-06-20 14:04:09 +02:00
tokenize	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
tts	sync : vendor (#13901)	2025-05-30 16:25:45 +03:00
CMakeLists.txt	mtmd : rename llava directory to mtmd (#13311)	2025-05-05 16:02:55 +02:00