llama.cpp

Xuan-Son Nguyen 33eff40240 server : vision support via libmtmd (#12898)	2025-05-09 19:29:37 +02:00

* server : (experimental) vision support via libmtmd
* mtmd : add more api around mtmd_image_tokens
* mtmd : add more api around mtmd_image_tokens
* mtmd : ability to calc image hash
* shared_ptr for mtmd_image_tokens
* move hash to user-define ID (fixed)
* abstract out the batch management
* small fix
* refactor logic adding tokens to batch
* implement hashing image
* use FNV hash, now hash bitmap instead of file data
* allow decoding image embedding to be split into batches
* rm whitespace
* disable some features when mtmd is on
* fix --no-mmproj-offload
* mtmd_context_params no timings
* refactor server_inp to server_tokens
* fix the failing test case
* init
* wip
* working version
* add mtmd::bitmaps
* add test target
* rm redundant define
* test: mtmd_input_chunks_free
* rm outdated comment
* fix merging issue
* explicitly create mtmd::input_chunks
* mtmd_input_chunk_copy
* add clone()
* improve server_input struct
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff
* fix detokenize
* add const to various places
* add warning about breaking changes
* add c api
* helper: use mtmd_image_tokens_get_n_pos
* fix ctx_shift
* fix name shadowing
* more strict condition
* support remote image_url
* remote image_url log
* add CI test
* do not log base64
* add "has_multimodal" to /props
* remove dangling image
* speculative: use slot.cache_tokens.insert
* Apply suggestions from code review
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* rm can_be_detokenized
* on prmpt processing done, assert cache_tokens.size
* handle_completions_impl returns void
* adapt the new web ui
* update docs and hot topics
* rm assert
* small fix (2)

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
batched-bench	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
cvector-generator	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
export-lora	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
gguf-split	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
imatrix	imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)	2025-05-09 11:53:58 +02:00
llama-bench	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
main	llama : do not crash if there is no CPU backend (#13395)	2025-05-09 13:02:07 +02:00
mtmd	server : vision support via libmtmd (#12898)	2025-05-09 19:29:37 +02:00
perplexity	context : remove logits_all flag (#13284)	2025-05-08 14:26:50 +03:00
quantize	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
rpc	llama : do not crash if there is no CPU backend (#13395)	2025-05-09 13:02:07 +02:00
run	llama-run: add support for downloading models from ModelScope (#13370)	2025-05-09 10:25:50 +01:00
server	server : vision support via libmtmd (#12898)	2025-05-09 19:29:37 +02:00
tokenize	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
tts	llama : move end-user examples to tools directory (#13249)	2025-05-02 20:27:13 +02:00
CMakeLists.txt	mtmd : rename llava directory to mtmd (#13311)	2025-05-05 16:02:55 +02:00