Inference support for T5 and FLAN-T5 model families (#5763)

* llama : add inference support and model types for T5 and FLAN-T5 model families

* llama : add new API functions to support encoder-decoder models: llama_encode(), llama_model_has_encoder(), llama_model_decoder_start_token()

* common, llama-cli, llama-batched : add support for encoder-decoder models

* convert-hf : handle shared token embeddings tensors in T5Model

* convert-hf : add support for SentencePiece BPE tokenizer in T5Model (for Pile-T5 models)

* convert-hf : add MT5ForConditionalGeneration and UMT5ForConditionalGeneration to architectures supported by T5Model

* convert : add t5 tokenizer tests, use "slow" HF tokenizer for t5
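
The encoder-decoder flow enabled by the new API functions can be sketched as follows. This is a minimal illustration, not the code from this commit: `MockModel` and the `mock_*` functions are hypothetical stand-ins for the roles of `llama_model_has_encoder()`, `llama_encode()` and `llama_model_decoder_start_token()`, so the block compiles without llama.cpp. The `-1` sentinel and the fallback to the BOS token are assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

using token_t = std::int32_t;

// Hypothetical stand-in for a loaded model; NOT the real llama_model.
struct MockModel {
    bool    has_encoder;          // true for T5-style encoder-decoder models
    token_t decoder_start_token;  // assumed: -1 means "not defined by the model"
    token_t bos_token;            // assumed fallback when no start token is set
};

// Stand-ins mirroring the roles of llama_model_has_encoder() and
// llama_model_decoder_start_token(); these signatures are assumptions.
bool mock_model_has_encoder(const MockModel & m) { return m.has_encoder; }
token_t mock_model_decoder_start_token(const MockModel & m) { return m.decoder_start_token; }

// Stand-in for llama_encode(): returns 0 on success, non-zero on failure.
// Here it only rejects an empty prompt; a real encoder pass does the actual work.
int mock_encode(const MockModel &, const std::vector<token_t> & prompt) {
    return prompt.empty() ? 1 : 0;
}

// Computes the tokens the decoder should be fed first.
// Decoder-only model: the prompt itself.
// Encoder-decoder model: the prompt is run through the encoder once, and the
// decoder then starts from a single start token.
bool prepare_decoder_input(const MockModel & m,
                           const std::vector<token_t> & prompt,
                           std::vector<token_t> & out) {
    if (!mock_model_has_encoder(m)) {
        out = prompt;
        return true;
    }
    if (mock_encode(m, prompt) != 0) {
        return false; // encoder pass failed
    }
    token_t start = mock_model_decoder_start_token(m);
    if (start == -1) {
        start = m.bos_token; // assumed fallback
    }
    out.assign(1, start);
    return true;
}
```

With a model described as `MockModel{true, 0, 1}`, any non-empty prompt yields a decoder input holding only token `0`; with `has_encoder` set to `false`, the prompt is passed through unchanged.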

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Author: fairydreaming
Date: 2024-07-04 15:46:11 +02:00
Committed by: GitHub
Parent: f8c4c0738d
Commit: 807b0c49ff
Signature: no known key found in database (GPG key ID: B5690EEEBB952194)

33 changed files with 946 additions and 31 deletions

models/ggml-vocab-deepseek-llm.gguf.out

 @@ -31,6 +31,7 @@
 403
 2906
 11 320 6 436 0 1724 418 340 33701 210 3025 19017 612 9407 2681 16 18 16 19 16 20 16 1398 68940 239
 3033
 
 18
 18 18
