StableVicuna - Hayo AI Tools
Introducing StableVicuna, the first large-scale open source chatbot trained via reinforcement learning from human feedback (RLHF). It is a further instruction fine-tuned and RLHF-trained version of Vicuna v0 13b, which is itself an instruction fine-tuned LLaMA 13b model.
We have developed this AI tool to provide a high-performance chatbot that can be utilized by various businesses to communicate with their users seamlessly. With StableVicuna, you can ensure that your users get the best experience when connecting with your business.
How StableVicuna Works
To achieve StableVicuna's strong performance, we use Vicuna as the base model and follow a three-stage RLHF pipeline. First, we further train the base Vicuna model with supervised fine-tuning (SFT) on a mixture of three datasets. Second, we use trlX to train a reward model, initialized from our further SFT model, on RLHF preference datasets. Finally, we use trlX to run Proximal Policy Optimization (PPO) reinforcement learning on the SFT model to arrive at StableVicuna.
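To make the second and third stages more concrete, here is a minimal Python sketch of the two objectives that RLHF pipelines of this kind commonly use: a pairwise preference loss for the reward model and the clipped surrogate objective from PPO. This is an illustration under assumptions, not StableVicuna's training code or trlX's API. The function names, the scalar inputs, and the `clip_eps` default are all hypothetical, and the exact losses used for StableVicuna are not stated here.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Assumed reward-model loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    response higher than the rejected one.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def ppo_clipped_objective(
    logp_new: float,
    logp_old: float,
    advantage: float,
    clip_eps: float = 0.2,  # hypothetical default, not taken from the source
) -> float:
    """PPO clipped surrogate objective for a single action (to be maximized)."""
    # Probability ratio between the updated policy and the policy that
    # generated the sample.
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    # Taking the minimum removes the incentive to move the policy far
    # from the old one in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Reward model agrees with the human label, so the loss is low.
    print(pairwise_preference_loss(2.0, 0.0))
    # Reward model disagrees with the human label, so the loss is high.
    print(pairwise_preference_loss(0.0, 2.0))
    # The ratio exceeds 1 + clip_eps, so the objective is clipped.
    print(ppo_clipped_objective(logp_new=1.0, logp_old=0.0, advantage=1.0))
```

In a real pipeline, the rewards would come from the trained reward model, and the log-probabilities and advantages would be computed per token over batches of generated responses.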
Features of StableVicuna
- Large-scale open source chatbot
- Trained via reinforcement learning from human feedback (RLHF)
- Further instruction fine-tuned and RLHF trained version of Vicuna v0 13b
- Provides seamless communication experience between businesses and users
How to Obtain StableVicuna
StableVicuna is available on the HuggingFace Hub. The model is downloadable as a weight delta against the original LLaMA model. Please note that you also need to have access to the original LLaMA model, which requires you to apply for LLaMA weights separately using the link provided in the GitHub repo. Once you have both the weight delta and the LLaMA weights, you can use a script provided in the GitHub repo to combine them and obtain StableVicuna-13B.
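The idea behind a weight delta can be illustrated with a toy Python sketch: each released parameter stores the difference from the corresponding LLaMA parameter, so adding the two recovers the final weights. This is not the script from the GitHub repo. The `apply_delta` function and the parameter names below are hypothetical, plain Python lists stand in for tensors, and the real script may include steps not shown here.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Return base + delta for every parameter (hypothetical helper)."""
    if base.keys() != delta.keys():
        mismatched = sorted(set(base) ^ set(delta))
        raise KeyError(f"parameter names differ: {mismatched}")

    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned weights.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


if __name__ == "__main__":
    # Parameter names here are made up for illustration.
    llama = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
    released_delta = {"layer0.weight": [0.5, -1.0], "layer0.bias": [0.25]}
    print(apply_delta(llama, released_delta))
    # {'layer0.weight': [1.5, 1.0], 'layer0.bias': [0.25]}
```

Distributing only the delta means the original LLaMA weights are never redistributed, which is why access to them has to be obtained separately.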
Announcing Our Upcoming Chatbot Interface
Alongside our chatbot, we are excited to preview our upcoming chat interface, which is in the final stages of development. The following screenshots offer a glimpse of what users can expect.
Our Commitment to Continuous Improvement
We are committed to continuously improving StableVicuna. Over the coming weeks, we will be iterating on this chatbot and deploying a Discord bot to the Stable Foundation server. We encourage you to try StableVicuna and provide us with valuable feedback to help us improve the user experience. For the time being, you can try the model on a HuggingFace space by visiting this link.
Get the best communication experience for your business - try StableVicuna today!