* ggml : FA supports F32 V
* graph : cast KV to F16 when the KV cache is not used

ggml-ci

* server : add test that exercises embeddings with FA enabled

ggml-ci

- test_basic.py
- test_chat_completion.py
- test_completion.py
- test_ctx_shift.py
- test_embedding.py
- test_infill.py
- test_lora.py
- test_rerank.py
- test_security.py
- test_slot_save.py
- test_speculative.py
- test_tokenize.py
- test_tool_call.py