llama.cpp/common
Xuan-Son Nguyen 33eff40240
server : vision support via libmtmd (#12898)
* server : (experimental) vision support via libmtmd

* mtmd : add more api around mtmd_image_tokens

* mtmd : add more api around mtmd_image_tokens

* mtmd : ability to calc image hash

* shared_ptr for mtmd_image_tokens

* move hash to user-define ID (fixed)

* abstract out the batch management

* small fix

* refactor logic adding tokens to batch

* implement hashing image

* use FNV hash, now hash bitmap instead of file data

* allow decoding image embedding to be split into batches

* rm whitespace

* disable some features when mtmd is on

* fix --no-mmproj-offload

* mtmd_context_params no timings

* refactor server_inp to server_tokens

* fix the failing test case

* init

* wip

* working version

* add mtmd::bitmaps

* add test target

* rm redundant define

* test: mtmd_input_chunks_free

* rm outdated comment

* fix merging issue

* explicitly create mtmd::input_chunks

* mtmd_input_chunk_copy

* add clone()

* improve server_input struct

* clip :  fix confused naming ffn_up and ffn_down

* rm ffn_i/o/g naming

* rename n_embd, n_ff

* small fix

* no check n_ff

* fix detokenize

* add const to various places

* add warning about breaking changes

* add c api

* helper: use mtmd_image_tokens_get_n_pos

* fix ctx_shift

* fix name shadowing

* more strict condition

* support remote image_url

* remote image_url log

* add CI test

* do not log base64

* add "has_multimodal" to /props

* remove dangling image

* speculative: use slot.cache_tokens.insert

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* rm can_be_detokenized

* on prmpt processing done, assert cache_tokens.size

* handle_completions_impl returns void

* adapt the new web ui

* update docs and hot topics

* rm assert

* small fix (2)

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-05-09 19:29:37 +02:00
..
cmake llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00
minja sync: minja (#12739) 2025-04-04 21:16:39 +01:00
arg.cpp server : vision support via libmtmd (#12898) 2025-05-09 19:29:37 +02:00
arg.h common : add common_remote_get_content (#13123) 2025-04-26 22:58:12 +02:00
base64.hpp llava : expose as a shared library for downstream projects (#3613) 2023-11-07 00:36:23 +03:00
build-info.cpp.in build : link against build info instead of compiling against it (#3879) 2023-11-02 08:50:16 +02:00
chat.cpp server : (webui) revamp the input area, plus many small UI improvements (#13365) 2025-05-08 15:37:29 +02:00
chat.h server: extract <think> tags from qwq outputs (#12297) 2025-03-10 10:59:03 +00:00
CMakeLists.txt ci : limit write permission to only the release step + fixes (#13392) 2025-05-08 23:45:22 +02:00
common.cpp context : remove logits_all flag (#13284) 2025-05-08 14:26:50 +03:00
common.h imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389) 2025-05-09 11:53:58 +02:00
console.cpp console : utf-8 fix for windows stdin (#9690) 2024-09-30 11:23:42 +03:00
console.h gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
json-schema-to-grammar.cpp grammar : handle maxItems == 0 in JSON schema (#13117) 2025-04-26 10:10:20 +02:00
json-schema-to-grammar.h tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) 2025-03-05 13:05:13 +00:00
json.hpp json-schema-to-grammar improvements (+ added to server) (#5978) 2024-03-21 11:50:43 +00:00
llguidance.cpp upgrade to llguidance 0.7.10 (#12576) 2025-03-26 11:06:09 -07:00
log.cpp Fix: Compile failure due to Microsoft STL breaking change (#11836) 2025-02-12 21:36:11 +01:00
log.h cleanup: fix compile warnings associated with gnu_printf (#11811) 2025-02-12 10:06:53 -04:00
ngram-cache.cpp ggml : portability fixes for VS 2017 (#12150) 2025-03-04 18:53:26 +02:00
ngram-cache.h llama : use LLAMA_TOKEN_NULL (#11062) 2025-01-06 10:52:15 +02:00
sampling.cpp common : Add a warning when we can't match samplers from a string or char. (#13330) 2025-05-07 11:23:28 +03:00
sampling.h sampling : support for llguidance grammars (#10224) 2025-02-02 09:55:32 +02:00
speculative.cpp llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
speculative.h speculative : update default params (#11954) 2025-02-19 13:29:42 +02:00
stb_image.h common : Update stb_image.h to latest version (#9161) 2024-08-27 08:58:50 +03:00