From a08c1d2845dc279d58d826ef3c3ecad97cbbcef7 Mon Sep 17 00:00:00 2001
From: ddpasa <112642920+ddpasa@users.noreply.github.com>
Date: Sun, 25 May 2025 14:04:49 +0200
Subject: [PATCH] docs : add Moondream2 pre-quantized link (#13745)

* Multimodal: Added Moondream2 model and fixed ggml.org link

* Apply suggestions from code review

---------

Co-authored-by: name
Co-authored-by: Xuan-Son Nguyen
---
 docs/multimodal.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/multimodal.md b/docs/multimodal.md
index ffcbbd77..2f3c416b 100644
--- a/docs/multimodal.md
+++ b/docs/multimodal.md
@@ -33,7 +33,7 @@ llama-server -hf ggml-org/gemma-3-4b-it-GGUF --no-mmproj-offload
 
 ## Pre-quantized models
 
-These are ready-to-use models, most of them come with `Q4_K_M` quantization by default. They can be found at the Hugging Face page of the ggml-org: https://huggingface.co/ggml-org
+These are ready-to-use models, most of them come with `Q4_K_M` quantization by default. They can be found at the Hugging Face page of the ggml-org: https://huggingface.co/collections/ggml-org/multimodal-ggufs-68244e01ff1f39e5bebeeedc
 
 Replaces the `(tool_name)` with the name of binary you want to use. For example, `llama-mtmd-cli` or `llama-server`
 
@@ -81,6 +81,10 @@ NOTE: some models may require large context window, for example: `-c 8192`
 
 # Llama 4 Scout
 (tool_name) -hf ggml-org/Llama-4-Scout-17B-16E-Instruct-GGUF
+
+# Moondream2 20250414 version
+(tool_name) -hf ggml-org/moondream2-20250414-GGUF
+
 ```
 
 **Audio models**:
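
A minimal sketch of how the commands documented by this patch might be exercised after it is applied. The `ggml-org/moondream2-20250414-GGUF` model name, the `-hf` flag, `llama-mtmd-cli`, `llama-server`, and the `-c 8192` context hint all come from the docs in the diff; the `--image`/`-p` arguments and the sample file `photo.jpg` are illustrative assumptions, not part of this change.

```sh
# Illustrative only: substitute (tool_name) with the binary you want, as the docs describe.
# Download and run the pre-quantized Moondream2 model via the -hf flag (from the patch):
# --image, -p, and photo.jpg are assumed example inputs, not defined by this patch.
llama-mtmd-cli -hf ggml-org/moondream2-20250414-GGUF --image photo.jpg -p "Describe this image."

# Or serve it over HTTP; -c 8192 mirrors the docs' note that some models
# may require a larger context window.
llama-server -hf ggml-org/moondream2-20250414-GGUF -c 8192
```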