llama.cpp/tools/server/tests/unit
Olivier Chafik f5cd27b71d
server: streaming of tool calls and thoughts when --jinja is on (#12379)
* add common_json w/ support for truncated json healing

* add common_chat_msg_diff

* partial common_chat_parse

* refactor parser w/ optionals

* server: wire chat diffs in stream mode

* fix trigger of thinking models (must happen after thoughts are closed)

* fix functionary v3.2 raw python!

* rename: common_chat_syntax (now contains format)

* rm common_regex.at_start

* don't return empty <think></think>

* accommodate yet another deepseek r1 distill fantasy syntax (`<|tool▁calls|>`)

* fix QwQ 32B tool call parsing after thoughts (hermes2)

* better logs for grammar triggers

* consume spaces after parse_json_tool_calls

* fix required tool calls w/ thinking models that have pre-opened thinking tags

* fix thinking model's initial trigger + test qwq's template

* run most test_tool_call tests in stream + non-stream modes

* make functionary v3.2 parsing more strict (differentiate first match from others)

* send final diff from server, to close off raw python arguments

* support partial content streaming in Generic mode

* tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5)

* Update function-calling.md

* Update tool_bench.py

* chat-parser: remove input from exception (llm output may contain PII)

---------

Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Olivier Chafik <ochafik@users.noreply.github.com>
2025-05-25 01:48:08 +01:00
test_basic.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_chat_completion.py server: streaming of tool calls and thoughts when --jinja is on (#12379) 2025-05-25 01:48:08 +01:00
test_completion.py server : fix cache_tokens bug with no cache_prompt (#13533) 2025-05-14 13:35:07 +02:00
test_ctx_shift.py server : do not return error out of context (with ctx shift disabled) (#13577) 2025-05-16 21:50:00 +02:00
test_embedding.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_infill.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_lora.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_rerank.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_security.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_slot_save.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_speculative.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_template.py server: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802) 2025-05-15 02:39:51 +01:00
test_tokenize.py llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
test_tool_call.py server: streaming of tool calls and thoughts when --jinja is on (#12379) 2025-05-25 01:48:08 +01:00
test_vision_api.py server : support audio input (#13714) 2025-05-23 11:03:47 +02:00
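
The commit message above mentions "truncated json healing", which is what lets a partially streamed tool call be parsed before the model has finished emitting it. The following is only a minimal Python sketch of that general idea, not the `common_json` implementation from the commit (that code is C++ and is not shown on this page). The function names `heal_truncated_json` and `parse_partial_json` are hypothetical, and the sketch assumes the simplest strategy: close an unterminated string, then close any containers that are still open.

```python
import json


def heal_truncated_json(text: str) -> str:
    """Best-effort repair of a JSON prefix (hypothetical helper, illustration only).

    Closes an unterminated string and appends the closers for any
    objects/arrays that are still open. It does not repair every
    truncation point (e.g. a cut right after a ':' or inside '\\uXXXX').
    """
    closers = []        # closing brackets for open containers, innermost last
    in_string = False
    escaped = False     # True when the previous char in a string was a backslash

    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    healed = text
    if in_string:
        if escaped:
            healed = healed[:-1]    # drop a dangling backslash
        healed += '"'
    return healed + "".join(reversed(closers))


def parse_partial_json(text: str):
    """Parse a possibly truncated JSON document; return None if healing fails."""
    try:
        return json.loads(heal_truncated_json(text))
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    # A tool call cut off in the middle of a string argument.
    partial = '{"name": "get_weather", "arguments": {"city": "Par'
    print(parse_partial_json(partial))
```

In a streaming setting, a caller could run something like `parse_partial_json` on each accumulated chunk and send only the difference from the previous result; a return value of `None` simply means "wait for more tokens". How the server actually computes and sends those diffs is defined by the commit itself, not by this sketch.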