kv-cache : refactor the update/defrag mechanism (#13988)

* kv-cache : refactor update mechanism

ggml-ci

* memory : improve status handling

* defrag : reset head + add comments

ggml-ci

* cont : minor fixes

ggml-ci
Georgi Gerganov 2025-06-04 18:58:20 +03:00 committed by GitHub
parent 2589ad3704
commit 3e63a58ef7
11 changed files with 340 additions and 191 deletions

@@ -52,9 +52,7 @@ public:
     llama_memory_state_ptr init_full() override;
-    bool update(llama_context & lctx) override;
-    void defrag_sched(float thold) override;
+    llama_memory_state_ptr init_update(llama_context * lctx, bool optimize) override;
     bool prepare(const std::vector<llama_ubatch> & ubatches);