Multilingual and open source: OpenGPT-X research project releases large language model
The large language model of the OpenGPT-X research project is now available for download on Hugging Face. "Teuken-7B" has been trained from scratch in all 24 official languages of the European Union and contains 7 billion parameters. Researchers and companies can leverage this commercially usable open source model for their own artificial intelligence applications.
The OpenGPT-X Consortium
The OpenGPT-X consortium, led by the Fraunhofer Institutes for Intelligent Analysis and Information Systems IAIS and for Integrated Circuits IIS, has developed an AI language model that is open source and has a distinctly European perspective.
Model Features and Benefits
In the OpenGPT-X project, leading industry and research partners spent two years researching the underlying technologies for large AI foundation models and training the models. The resulting "Teuken-7B" model is freely available, providing a public, research-based alternative for use in academia and industry.
Tokenizer Development
In addition to model training, the OpenGPT-X team addressed research questions around training and operating multilingual AI language models more efficiently. To this end, the project developed a multilingual tokenizer that encodes text in European languages with fewer tokens, which lowers training and inference costs and improves model performance across languages.
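As a rough illustration of what a multilingual tokenizer means in practice, the sketch below counts how many tokens a tokenizer loaded from Hugging Face needs for the same short sentence in several EU languages. The repository name and the trust_remote_code flag are assumptions, not confirmed details of the release; check the openGPT-X organization on Hugging Face for the actual model cards.

```python
# Sketch: inspect how compactly a multilingual tokenizer encodes short
# sentences in different EU languages. A tokenizer that needs fewer tokens
# per word makes training and inference cheaper for those languages.
from transformers import AutoTokenizer

# Assumed Hugging Face repository name (illustrative).
MODEL_ID = "openGPT-X/Teuken-7B-instruct-research-v0.4"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

samples = {
    "en": "The weather is very nice today.",
    "de": "Das Wetter ist heute sehr schön.",
    "fr": "Il fait très beau aujourd'hui.",
    "pl": "Dzisiaj jest bardzo ładna pogoda.",
}

for lang, text in samples.items():
    n_tokens = len(tokenizer(text)["input_ids"])
    n_words = len(text.split())
    print(f"{lang}: {n_tokens} tokens for {n_words} words "
          f"({n_tokens / n_words:.2f} tokens per word)")
```

A lower tokens-per-word ratio for non-English languages is the kind of efficiency gain a multilingual tokenizer is meant to deliver.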
Industry Applications and Innovations
The technology developed in OpenGPT-X will provide a basis for partners to train their own models in the future, enabling companies to create customized AI solutions for various applications without relying on third-party components.
Collaborative Efforts and Future Outlook
The collaborative efforts of the consortium partners have produced valuable foundational technology in the OpenGPT-X project. The research project, which began in 2022, will run until March 2025, allowing time for further optimization and evaluation of the models.
Interested developers can download Teuken-7B free of charge from Hugging Face for research or commercial purposes. The model has been optimized for chat applications through "instruction tuning", i.e. fine-tuning on examples of how to follow and answer user prompts, which makes it easier to use in practical scenarios.
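As a sketch of how such an instruction-tuned checkpoint is typically loaded and queried with the Hugging Face transformers library, the example below assumes the repository name, the need for trust_remote_code, and the availability of a standard chat template; the exact prompt format of the released model may differ from this generic pattern.

```python
# Minimal sketch of loading an instruction-tuned checkpoint from Hugging Face
# and generating a chat-style reply. Repo id, trust_remote_code, and the
# chat template are assumptions, not confirmed details of the release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openGPT-X/Teuken-7B-instruct-research-v0.4"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single GPU
    device_map="auto",
    trust_remote_code=True,
)

# Instruction-tuned models are usually queried through a chat template.
messages = [{"role": "user", "content": "Wie heißen die Mitgliedstaaten der EU?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The bfloat16 setting assumes a GPU with enough memory for a 7-billion-parameter model; on smaller hardware, a quantized variant or CPU offloading would be needed.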