llama.cpp/include
Latest commit 6562e5a4d6 by Georgi Gerganov, 2025-05-08 14:28:33 +03:00
context : allow cache-less context for embeddings (#13108)
* context : allow cache-less context for embeddings
* context : enable reranking with encode()
* context : encode() clears embd_seq
* examples : use llama_encode() when appropriate
* models : nomic bert moe does not require KV cache
* llama : update comments for llama_decode/llama_encode
* context : update warning log [no ci]
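Since most of the bullets above concern the new encode() path, a minimal sketch of the cache-less embedding flow through the public llama.h C API follows. It is illustrative only, not code from the commit: the model path "model.gguf", the input text, and the mean-pooling choice are assumptions for the example.

    // Minimal sketch, not from the commit: extract a pooled sequence
    // embedding through the llama.h C API using llama_encode().
    // "model.gguf" is a placeholder path; mean pooling is an assumption.
    #include "llama.h"

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        llama_backend_init();

        struct llama_model * model = llama_model_load_from_file(
                "model.gguf", llama_model_default_params());
        if (model == NULL) {
            return 1;
        }

        const struct llama_vocab * vocab = llama_model_get_vocab(model);

        struct llama_context_params cparams = llama_context_default_params();
        cparams.embeddings   = true;                     // extract embeddings
        cparams.pooling_type = LLAMA_POOLING_TYPE_MEAN;  // assumed pooling

        // for encoder-only embedding models the context no longer needs
        // a KV cache (the cache-less context this commit enables)
        struct llama_context * ctx = llama_init_from_model(model, cparams);
        if (ctx == NULL) {
            return 1;
        }

        const char * text = "hello world";
        llama_token tokens[64];
        const int n_tokens = llama_tokenize(vocab, text, (int32_t) strlen(text),
                tokens, 64, /*add_special*/ true, /*parse_special*/ false);
        if (n_tokens < 0) {
            return 1;
        }

        // embedding models go through llama_encode(), not llama_decode()
        struct llama_batch batch = llama_batch_get_one(tokens, n_tokens);
        if (llama_encode(ctx, batch) != 0) {
            return 1;
        }

        // with a pooled context, the whole-sequence embedding is read back
        // per sequence id; llama_batch_get_one() uses sequence id 0
        const int     n_embd = llama_model_n_embd(model);
        const float * embd   = llama_get_embeddings_seq(ctx, 0);
        if (embd != NULL) {
            printf("embedding[0] of %d dims: %f\n", n_embd, embd[0]);
        }

        llama_free(ctx);
        llama_model_free(model);
        llama_backend_free();
        return 0;
    }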
File          Last commit                                                      Date
llama-cpp.h   llama : add llama_vocab, functions -> methods, naming (#11110)   2025-01-12 11:32:42 +02:00
llama.h       context : allow cache-less context for embeddings (#13108)       2025-05-08 14:28:33 +03:00