Teuken 7B (OpenGPT-X)
|
mT5 (Google)
|
|||||
About
Teuken-7B is a multilingual, open-source language model developed under the OpenGPT-X initiative and designed for Europe's diverse linguistic landscape. More than 50% of its training data is non-English text, and the data covers all 24 official languages of the European Union, which is intended to support consistent performance across them. A key innovation in Teuken-7B is its custom multilingual tokenizer, optimized for European languages, which improves training efficiency and reduces inference costs compared to standard monolingual tokenizers. The model is available in two versions: Teuken-7B-Base, the foundational pre-trained model, and Teuken-7B-Instruct, which has undergone instruction tuning to follow user prompts more reliably. Both versions are available on Hugging Face, promoting transparency and collaboration within the AI community.
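One common way to compare tokenizers on cost is "fertility", the average number of tokens produced per word: fewer tokens per word means fewer model steps for the same text. The sketch below is only an illustration of that measurement. The two tokenizers in it are toy stand-ins (hypothetical, not the Teuken-7B tokenizer or any real library), and the sample sentence is an arbitrary German example.

```python
def fertility(tokenize, text):
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    return len(tokenize(text)) / len(words)


# Toy stand-ins for real tokenizers (assumptions for illustration only).
def word_tokenizer(text):
    # One token per whitespace-separated word.
    return text.split()


def char_tokenizer(text):
    # One token per non-space character: a worst case for token count.
    return [ch for ch in text if not ch.isspace()]


sample = "Die Würde des Menschen ist unantastbar"

coarse = fertility(word_tokenizer, sample)
fine = fertility(char_tokenizer, sample)

# The tokenizer with lower fertility needs fewer tokens for the same text.
print(f"word-level: {coarse:.2f} tokens/word")
print(f"char-level: {fine:.2f} tokens/word")
```

To compare real tokenizers, any callable that maps a string to a list of tokens could be passed in place of the toy functions, and the sample should be replaced with representative text in each language of interest.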
|
About
Multilingual T5 (mT5) is a massively multilingual pretrained text-to-text transformer model, trained with a recipe similar to that of T5. This repo can be used to reproduce the experiments in the mT5 paper.
mT5 is pretrained on the mC4 corpus, covering 101 languages:
Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, and more.
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
Researchers who want a multilingual language model for their natural language processing tasks
|
Audience
Developers interested in a large multilingual transformer language model
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Pricing
Free
Free Version
Free Trial
|
Pricing
Free
Free Version
Free Trial
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company Information: OpenGPT-X
Founded: 2022
Germany
opengpt-x.de/en/models/teuken-7b/
|
Company Information: Google
Founded: 1998
United States
github.com/google-research/multilingual-t5
|
|||||