llama.cpp

Radoslav Gerganov 553a5c3a9f rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (#12943)	2025-04-25 10:08:08 +03:00

RPC_CMD_SET_TENSOR always returns an empty response and we send this 4 times per token. We can improve TG speed if we don't wait for this empty response. The performance impact of this change depends on the network latency.

cmake	scripts : update sync + fix cmake merge	2025-03-27 10:09:29 +02:00
include	rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (#12943)	2025-04-25 10:08:08 +03:00
src	rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (#12943)	2025-04-25 10:08:08 +03:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)	2025-04-21 18:13:51 +02:00