llama.cpp/tools

Latest commit: 6562e5a4d6 by Georgi Gerganov, 2025-05-08 14:28:33 +03:00
context : allow cache-less context for embeddings (#13108)

* context : allow cache-less context for embeddings
* context : enable reranking with encode()
* context : encode() clears embd_seq
* examples : use llama_encode() when appropriate
* models : nomic bert moe does not require KV cache
* llama : update comments for llama_decode/llama_encode
* context : update warning log [no ci]

ggml-ci
Name                Last commit                                                   Date
batched-bench       llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
cvector-generator   llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
export-lora         llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
gguf-split          llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
imatrix             context : remove logits_all flag (#13284)                     2025-05-08 14:26:50 +03:00
llama-bench         llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
main                context : remove logits_all flag (#13284)                     2025-05-08 14:26:50 +03:00
mtmd                clip : refactor graph builder (#13321)                        2025-05-06 22:40:24 +02:00
perplexity          context : remove logits_all flag (#13284)                     2025-05-08 14:26:50 +03:00
quantize            llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
rpc                 rpc : use backend registry, support dl backends (#13304)      2025-05-04 21:25:43 +02:00
run                 llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
server              context : allow cache-less context for embeddings (#13108)    2025-05-08 14:28:33 +03:00
tokenize            llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
tts                 llama : move end-user examples to tools directory (#13249)    2025-05-02 20:27:13 +02:00
CMakeLists.txt      mtmd : rename llava directory to mtmd (#13311)                2025-05-05 16:02:55 +02:00