mtmd : add support for Qwen2-Audio and SeaLLM-Audio (#13760)

* mtmd : add Qwen2-Audio support
* small clean up
* update discussion link
* clarify mtmd_get_output_embd
* clarification in multimodal.md
* fix ultravox bug
* ggml_cont
2025-05-25 14:06:32 +02:00 · 40aaa8a403
commit 40aaa8a403
parent a08c1d2845
9 changed files with 144 additions and 52 deletions
--- a/tools/mtmd/mtmd.h
+++ b/tools/mtmd/mtmd.h
@@ -203,6 +203,8 @@ MTMD_API int32_t mtmd_encode_chunk(mtmd_context * ctx,
                                   const mtmd_input_chunk * chunk);

 // get output embeddings from the last encode pass
+// the reading size (in bytes) is equal to:
+// llama_model_n_embd(model) * mtmd_input_chunk_get_n_tokens(chunk) * sizeof(float)
 MTMD_API float * mtmd_get_output_embd(mtmd_context * ctx);

 /////////////////////////////////////////
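
The size formula in the new comment can be sketched as a small helper. This is a minimal illustration, not code from this commit: `embd_output_bytes` and `copy_output_embd` are hypothetical names, and `src` stands in for the pointer returned by `mtmd_get_output_embd(ctx)`. In real use, `n_embd` would come from `llama_model_n_embd(model)` and `n_tokens` from `mtmd_input_chunk_get_n_tokens(chunk)`. The copy step assumes the caller wants its own buffer, for example because the output may be overwritten by a later encode pass.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not part of mtmd.h): number of bytes readable from the
// output embedding buffer, i.e. n_embd * n_tokens * sizeof(float).
// Returns 0 if an argument is not positive or the product would overflow size_t.
static size_t embd_output_bytes(int32_t n_embd, size_t n_tokens) {
    if (n_embd <= 0 || n_tokens == 0) {
        return 0;
    }
    size_t row_bytes = (size_t) n_embd * sizeof(float);
    if (n_tokens > SIZE_MAX / row_bytes) {
        return 0; // overflow guard
    }
    return row_bytes * n_tokens;
}

// Hypothetical helper: copy the embeddings into a caller-owned buffer.
// `src` stands in for the result of mtmd_get_output_embd(ctx).
// Returns NULL on invalid input or allocation failure; the caller frees the result.
static float * copy_output_embd(const float * src, int32_t n_embd, size_t n_tokens) {
    size_t n_bytes = embd_output_bytes(n_embd, n_tokens);
    if (src == NULL || n_bytes == 0) {
        return NULL;
    }
    float * dst = malloc(n_bytes);
    if (dst != NULL) {
        memcpy(dst, src, n_bytes);
    }
    return dst;
}
```

The overflow guard is a defensive choice for the sketch; the header comment itself only states the product.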