Commit graph

7 commits

Author SHA1 Message Date
Nicolò Scipione
f7c9429c85
sycl : Overcoming workaround for mmap() allocation on Windows (#13482)
* Remove mmap workaround on windows

After some testing I found that mmap is supported on windows and for
many GPUs on Linux. Therefore I remove the workaround for windows since
it is not necessary.

* Update llama-bench README

SYCL backend introduced a workaround that allows execution of
llama-bench also without specifying `--mmp 0` flag
2025-05-20 08:54:43 +08:00
Diego Devesa
6c8b91500e
llama-bench : fix -ot with dl backends (#13563) 2025-05-15 15:46:55 +02:00
Georgi Gerganov
b2838049cc
bench : handle decode errors (#13548)
ggml-ci
2025-05-15 05:57:02 +03:00
Diego Devesa
cf0a43bb64
llama-bench : add defrag-thold, check for invalid ranges (#13487) 2025-05-13 00:31:37 +02:00
Diego Devesa
22cdab343b
llama-bench : accept ranges for integer parameters (#13410) 2025-05-12 13:08:22 +02:00
David Huang
7f323a589f
Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (#13386) 2025-05-11 14:18:39 +02:00
Diego Devesa
1d36b3670b
llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00