Inference support for T5 and FLAN-T5 model families (#5763)

* llama : add inference support and model types for T5 and FLAN-T5 model families

* llama : add new API functions to support encoder-decoder models: llama_encode(), llama_model_has_encoder(), llama_model_decoder_start_token()

* common, llama-cli, llama-batched : add support for encoder-decoder models

* convert-hf : handle shared token embeddings tensors in T5Model

* convert-hf : add support for SentencePiece BPE tokenizer in T5Model (for Pile-T5 models)

* convert-hf : add MT5ForConditionalGeneration and UMT5ForConditionalGeneration to architectures supported by T5Model

* convert : add t5 tokenizer tests, use "slow" HF tokenizer for t5

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
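
The commit message names three new API functions for encoder-decoder models but does not show how a caller combines them. The sketch below is a self-contained illustration of one plausible call order: check for an encoder, run the encoder pass over the prompt, then seed the decoder with the decoder start token. Every type and function in it (`MockModel`, `MockContext`, `mock_encode`, `prepare_decoder_input`) is a hypothetical stand-in, not the llama.cpp API, and the fallback to the BOS token when the model defines no decoder start token is an assumption rather than something this commit message states.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT llama.cpp types.
using token_t = int32_t;

struct MockModel {
    bool    has_encoder;          // what a check like llama_model_has_encoder() would report
    token_t decoder_start_token;  // -1 means the model does not define one (assumed convention)
    token_t bos_token;            // used as the fallback start token (assumption)
};

struct MockContext {
    std::vector<token_t> encoded; // tokens that went through the encoder pass
};

// Stand-in for an encoder pass such as llama_encode(); returns 0 on success.
int mock_encode(MockContext & ctx, const std::vector<token_t> & tokens) {
    if (tokens.empty()) {
        return 1; // nothing to encode
    }
    ctx.encoded = tokens;
    return 0;
}

// Prepares the token sequence that decoding starts from.
// Decoder-only models keep their prompt unchanged. Encoder-decoder models
// have the prompt consumed by the encoder, and the decoder input is replaced
// by a single start token. Returns false if the encoder pass fails.
bool prepare_decoder_input(const MockModel & model, MockContext & ctx,
                           std::vector<token_t> & input) {
    if (!model.has_encoder) {
        return true;
    }
    if (mock_encode(ctx, input) != 0) {
        return false;
    }
    token_t start = model.decoder_start_token;
    if (start == -1) {
        start = model.bos_token; // assumed fallback
    }
    input.clear();
    input.push_back(start);
    return true;
}
```

In this sketch the prompt tokens end up in the encoder state and the decoder begins from a one-token sequence, which is the main behavioral difference from a decoder-only model, where the prompt itself is fed to the decoder.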

fairydreaming, 2024-07-04 15:46:11 +02:00 (committed by GitHub)

parent f8c4c0738d

commit 807b0c49ff

GPG key ID: B5690EEEBB952194

33 changed files with 946 additions and 31 deletions

models/ggml-vocab-starcoder.gguf.out

@@ -31,6 +31,7 @@
 299
 34719
 49 553 44 483 38 4998 904 863 18445 247 1037 4995 13379 2924 9515 17823 54 56 54 57 54 58 54 11904 47892
 3226
 
 56
 56 56
