llama.cpp/models
Olivier Chafik 669912d9a5
tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
* sampler: turn lazy grammar trigger words to regexes

* add scripts/tool_bench.sh & .py

* constrain llama json output regardless of function name if matches at beginning

* update relaxed newline space rule in grammar tests

* support add_generation_prompt query parameter (useful for /apply_template)

* Update src/llama-grammar.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2025-03-05 13:05:13 +00:00
templates tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) 2025-03-05 13:05:13 +00:00
.editorconfig gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
ggml-vocab-aquila.gguf Work on the BPE tokenizer (#3252) 2023-10-03 09:16:26 +02:00
ggml-vocab-baichuan.gguf Add more tokenizer tests (#3742) 2023-10-24 09:17:17 +02:00
ggml-vocab-bert-bge.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-bert-bge.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-bert-bge.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-chameleon.gguf.inp llama : add support for Chameleon (#8543) 2024-09-28 15:08:43 +03:00
ggml-vocab-chameleon.gguf.out llama : add support for Chameleon (#8543) 2024-09-28 15:08:43 +03:00
ggml-vocab-command-r.gguf command-r : add BPE pre-tokenization (#7063) 2024-05-05 08:19:30 +03:00
ggml-vocab-command-r.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-command-r.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-deepseek-coder.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-deepseek-coder.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-deepseek-coder.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-deepseek-llm.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-deepseek-llm.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-deepseek-llm.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-deepseek-r1-qwen.gguf.inp llama : add support for Deepseek-R1-Qwen distill model (#11310) 2025-01-20 14:35:07 +01:00
ggml-vocab-deepseek-r1-qwen.gguf.out llama : add support for Deepseek-R1-Qwen distill model (#11310) 2025-01-20 14:35:07 +01:00
ggml-vocab-falcon.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-falcon.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-falcon.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-gpt-2.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-gpt-2.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-gpt-2.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-gpt-4o.gguf.inp llama : add Phi-4-mini support (supersede #12099) (#12108) 2025-02-28 12:44:11 +01:00
ggml-vocab-gpt-4o.gguf.out llama : add Phi-4-mini support (supersede #12099) (#12108) 2025-02-28 12:44:11 +01:00
ggml-vocab-gpt-neox.gguf Add more tokenizer tests (#3742) 2023-10-24 09:17:17 +02:00
ggml-vocab-llama-bpe.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-llama-bpe.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-llama-bpe.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-llama-spm.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-llama-spm.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-llama-spm.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-mpt.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-mpt.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-mpt.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-phi-3.gguf Per token attributes (#7685) 2024-06-04 09:17:17 +02:00
ggml-vocab-phi-3.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-phi-3.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-qwen2.gguf llama : add BPE pre-tokenization for Qwen2 (#7114) 2024-05-08 15:06:43 +03:00
ggml-vocab-qwen2.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-qwen2.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-refact.gguf tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-refact.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-refact.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-roberta-bpe.gguf.inp convert : add support for Roberta embeddings (#10695) 2024-12-07 09:02:14 +02:00
ggml-vocab-roberta-bpe.gguf.out convert : add support for Roberta embeddings (#10695) 2024-12-07 09:02:14 +02:00
ggml-vocab-starcoder.gguf llama : fix BPE pre-tokenization (#6920) 2024-04-29 16:58:41 +03:00
ggml-vocab-starcoder.gguf.inp Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00
ggml-vocab-starcoder.gguf.out Inference support for T5 and FLAN-T5 model families (#5763) 2024-07-04 15:46:11 +02:00