A Look at Open-Source Alternatives to ChatGPT
ChatGPT has been gaining popularity since its release in November 2022. However, it has also sparked an AI arms race among tech giants, making the industry less open and more competitive. Closed LLMs such as ChatGPT, Bard, and Claude have many advantages, but they limit research labs and scientists who want to study and better understand LLMs. They can also be inconvenient for organizations that want to create and run their own models.
Fortunately, there is a community effort to create open-source models that match the performance of state-of-the-art LLMs. One of the most important open-source language models comes from FAIR, Meta's AI research lab. In February 2023, FAIR released LLaMA, a family of LLMs that comes in four sizes: 7, 13, 33, and 65 billion parameters.
FAIR researchers trained LLaMA on more tokens than is typical for models of its size, which makes the models cheaper to run and easier to retrain and fine-tune for specific tasks and use cases. This has made it possible for other researchers to fine-tune the model for ChatGPT-like performance through techniques such as reinforcement learning from human feedback (RLHF). Meta released the model under a non-commercial license focused on research use cases.
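For teams that obtain the weights under Meta's research license, the model loads with standard open-source tooling. Here is a minimal sketch using Hugging Face transformers (version 4.28 or later, which added LLaMA support); the local path is a placeholder, since Meta does not distribute the weights publicly.

```python
# Minimal sketch: load LLaMA-7B for fine-tuning with Hugging Face transformers.
# Assumes you obtained the weights under Meta's research license and converted
# them to the Hugging Face format; "path/to/llama-7b" is a placeholder path.
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-7b")
model = LlamaForCausalLM.from_pretrained("path/to/llama-7b")
model.train()  # puts the model in training mode for a standard fine-tuning loop
```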
In March, researchers at Stanford released Alpaca, an instruction-following LLM based on LLaMA 7B. They fine-tuned the model using a technique called self-instruct, in which a strong existing LLM generates instruction, input, and output samples that are then used to fine-tune another model. According to their preliminary experiments, Alpaca's performance is very similar to that of InstructGPT. The researchers made the self-instruct dataset, the details of the data-generation process, and the code for generating the data and fine-tuning the model available to the public. However, Alpaca is intended only for academic research.
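To make the self-instruct format concrete, here is a small sketch of how an instruction, input, and output record is assembled into a training prompt. The field names match the released Alpaca dataset; the template text paraphrases the one in the Stanford Alpaca repository, so treat the exact wording as an approximation.

```python
# Sketch: turn one self-instruct record into a supervised fine-tuning prompt.
# The instruction/input/output fields mirror the released Alpaca dataset;
# the template wording approximates the Stanford Alpaca repository's template.
PROMPT_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def format_example(record: dict) -> str:
    """Render one instruction-following record as a training prompt."""
    return PROMPT_TEMPLATE.format(**record)

example = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "I loved every minute of this movie.",
    "output": "Positive",
}
print(format_example(example))
```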
Researchers at UC Berkeley, Carnegie Mellon University, Stanford, and UC San Diego released Vicuna, another instruction-following LLM based on LLaMA. Vicuna comes in two sizes, 7 billion and 13 billion parameters. They fine-tuned Vicuna using the training code from Alpaca and 70,000 user-shared conversations from ShareGPT. Preliminary evaluations, which used GPT-4 as a judge, show that Vicuna outperforms LLaMA and Alpaca and comes close to Bard and ChatGPT. The researchers released the model weights along with a full framework to install, train, and run LLMs. There is also an online demo where you can test and compare Vicuna with other open-source instruction LLMs.
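Since the weights are public, Vicuna can also be run with generic open-source tooling rather than the team's own framework. The sketch below uses Hugging Face transformers; the model id and the USER/ASSISTANT prompt format are assumptions based on the weights the team later published, not details from the original release.

```python
# Sketch: run Vicuna for inference with Hugging Face transformers.
# "lmsys/vicuna-7b-v1.5" is an assumed Hugging Face model id, and the
# USER/ASSISTANT format is the conversation template Vicuna expects.
# device_map="auto" requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lmsys/vicuna-7b-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "USER: Explain instruction fine-tuning in one sentence. ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```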
In March, Databricks released Dolly, a fine-tuned version of EleutherAI's GPT-J 6B. The researchers were inspired by the work done by the teams behind LLaMA and Alpaca. Training Dolly cost less than $30 and took 30 minutes on a single machine. Databricks later released Dolly 2.0, a 12-billion-parameter model based on EleutherAI's Pythia. They fine-tuned the model on a dataset of 15,000 instruction-following examples written entirely by humans. Databricks released the trained Dolly 2.0 model, which, unlike its predecessors, has no restrictive licensing terms and can be used for commercial purposes.
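Because Dolly 2.0's weights are freely licensed, running it requires nothing beyond standard open-source libraries. Here is a minimal sketch with Hugging Face transformers; the model id and the need for trust_remote_code (Dolly ships a custom text-generation pipeline alongside its weights) are assumptions worth verifying against Databricks' documentation.

```python
# Sketch: run Dolly 2.0 with the Hugging Face pipeline API.
# "databricks/dolly-v2-12b" is the assumed Hugging Face model id;
# trust_remote_code=True loads Dolly's custom instruction-following pipeline,
# and device_map="auto" (via the accelerate package) places it on available GPUs.
import torch
from transformers import pipeline

generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

print(generate_text("Explain the difference between a base LLM and an instruction-tuned LLM."))
```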
Overall, open-source alternatives to ChatGPT give researchers and organizations more flexibility and easier access to sophisticated language models. With more research and development, these models have the potential to compete with commercial LLMs and prevent a few wealthy organizations from having too much control over the LLM market.