Hugging Face Releases Free ChatGPT Clone: HuggingChat
Hugging Face, an AI community and platform that provides free open source tools for developing machine learning and AI apps, has released a new open source ChatGPT clone called HuggingChat. This clone is free to use, and no registration is required to access it. HuggingChat is based on the Open Assistant Conversational AI Model, which is a project of the non-profit Large-scale Artificial Intelligence Open Network (LAION). LAION is a global non-profit organization dedicated to providing access to cutting-edge technology as open source. The goal of Hugging Face is to democratize machine learning research and its applications in order to have a positive impact on our world.
The HuggingChat ChatGPT clone was created using the same training methodology that OpenAI used for ChatGPT, known as reinforcement learning from human feedback (RLHF). RLHF depends on a high-quality dataset of questions and answers that humans have annotated and rated for quality; those ratings are then used to train the AI to follow directions. With this release, Hugging Face has accomplished its goal of putting the RLHF technique within reach of anyone who wants to train an AI.
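To make the role of human ratings more concrete: one common ingredient of RLHF is a reward model trained on pairs of answers where annotators preferred one over the other. The sketch below shows the pairwise preference loss typically used for that step. It is an illustration only, not HuggingChat's or Open Assistant's actual training code, and the function name `preference_loss` and its parameters are hypothetical.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper for illustration. The loss is small when the reward
    model scores the human-preferred answer higher than the rejected one.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The reward model agrees with the human ranking: low loss.
agree = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
# The reward model disagrees with the human ranking: high loss.
disagree = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
```

Minimizing this loss over many human-ranked answer pairs pushes the reward model to score preferred answers higher; the chat model is then tuned to produce answers the reward model scores well.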
The OpenAssistant Conversations Dataset (OASST1) was used to train HuggingChat. This dataset is recent, containing data collected up until April 12, 2023. It consists of 161,443 messages distributed across 66,497 conversation trees in 35 different languages, annotated with 461,292 quality ratings. The dataset is the result of a crowdsourcing effort by over 13,000 volunteers worldwide. However, the crowdsourcing approach also limits the dataset's quality, because it carries the cultural and subjective biases of the individuals who created and rated the training data.
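A small sketch can show what a "conversation tree" with quality ratings looks like, and what the reported totals imply on average. The `Message` class, its field names, and the toy conversation are assumptions made for illustration; they are not the dataset's actual schema. Only the three totals come from the figures above.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Message:
    """One node in a conversation tree (hypothetical schema)."""
    role: str                                     # "prompter" or "assistant"
    text: str
    ratings: list = field(default_factory=list)   # volunteer quality ratings
    replies: list = field(default_factory=list)   # child messages


def count_messages(node: Message) -> int:
    """Count a message plus all of its descendants."""
    return 1 + sum(count_messages(child) for child in node.replies)


def mean_rating(node: Message) -> float:
    """Average rating of a message, or 0.0 if it has not been rated."""
    return mean(node.ratings) if node.ratings else 0.0


def best_thread(root: Message) -> list:
    """Follow the highest-rated reply at each level of the tree."""
    thread, node = [root], root
    while node.replies:
        node = max(node.replies, key=mean_rating)
        thread.append(node)
    return thread


# Toy tree: one prompt, two competing answers, one follow-up question.
tree = Message("prompter", "What is RLHF?", replies=[
    Message("assistant", "A training method that uses human feedback.",
            ratings=[0.9, 0.8],
            replies=[Message("prompter", "Can you give an example?")]),
    Message("assistant", "I don't know.", ratings=[0.2, 0.1, 0.3]),
])

# Averages implied by the reported dataset totals.
MESSAGES, TREES, RATINGS = 161_443, 66_497, 461_292
messages_per_tree = MESSAGES / TREES       # about 2.4 messages per tree
ratings_per_message = RATINGS / MESSAGES   # about 2.9 ratings per message
```

Because one prompt can receive several competing replies, each rated by several volunteers, a tree captures both the conversation and the human judgment about which branch is better.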
The researchers who created the dataset also warned that the most engaged participants contributed the most, which skews the dataset toward their values and biases. As a result, the researchers concluded, the dataset may not represent the diversity of viewpoints across all the contributors. However, they stand behind the dataset because it was created with strict quality guidelines to prevent harmful content and to encourage contributors to generate high-quality responses.
Although HuggingChat is not yet at the ChatGPT level of output, it is still an impressive achievement and an important step for the open source community. The app page lists it as version 0.0, which signals how early in development it is. HuggingChat is available now, and no login account is needed to use it.