* llama : use n_swa + n_ubatch cells for SWA cache

  ggml-ci

* llama : add warning about multi-sequence SWA contexts

| Name |
|---|
| .. |
| llama-cpp.h |
| llama.h |