convert : fix gemma v1 tokenizer convert (#8248)

ggml-ci
Georgi Gerganov 2024-07-04 10:41:03 +03:00 committed by GitHub
parent f619024764
commit 20fc3804bf
GPG key ID: B5690EEEBB952194
28 changed files with 85 additions and 4 deletions

@@ -91,6 +91,10 @@ __ggml_vocab_test__
__ggml_vocab_test__
333333333
__ggml_vocab_test__
Cửa Việt
__ggml_vocab_test__
discards
__ggml_vocab_test__
@@ -104,5 +108,3 @@ __ggml_vocab_test__
🚀 (normal) 😶‍🌫️ (multiple emojis concatenated) ✅ 🦙🦙 3 33 333 3333 33333 333333 3333333 33333333 3.3 3..3 3...3 កាន់តែពិសេសអាច😁 ?我想在apple工作1314151天 ------======= нещо на Български ''''''```````""""......!!!!!!?????? I've been 'told he's there, 'RE you sure? 'M not sure I'll make it, 'D you like some tea? We'Ve a'lL
__ggml_vocab_test__
Việt
__ggml_vocab_test__