DeepSeek V3: Can Free and Open-Source AI Chatbot Beat...
What if I told you there is a new AI chatbot that outperforms almost every model in the AI space and is also free and open source? DeepSeek V3 is exactly that. On the Aider LLM Leaderboard, DeepSeek V3 currently sits in second place, ahead of GPT-4o, Claude 3.5 Sonnet, and even the newly announced Gemini 2.0. It trails only the o1 reasoning model, which takes minutes to generate a result.
So, is it finally time to switch to an open-source AI model? Should we cancel our Gemini and ChatGPT subscriptions? In this article, I will share my experience with DeepSeek V3 and see how well it stacks up against the top players.
Developed by DeepSeek
Developed by the Chinese AI firm DeepSeek, DeepSeek V3 uses a transformer-based architecture. Specifically, it is a Mixture-of-Experts (MoE) transformer, in which a router sends each token to a small subset of specialized sub-networks ("experts"), so only a fraction of the model's parameters is active at any time. This makes the model highly efficient. DeepSeek releases all its AI models as open source, and DeepSeek V3 is the first open-source model to surpass even closed-source models in the company's benchmarks, especially in coding and math.
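To make the Mixture-of-Experts idea a bit more concrete, here is a minimal Python sketch of generic top-k expert routing. It is not DeepSeek's actual implementation: the function names (`route_top_k`, `moe_layer`, `make_expert`), the toy "experts" that simply scale their input, and the choice of 2 active experts out of 8 are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    return lambda x: [scale * xi for xi in x]


def moe_layer(x, experts, router_weights, k=2):
    """Run only the k selected experts and mix their outputs by router weight."""
    # One router score per expert: dot product of that expert's router row with x.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    chosen = route_top_k(scores, k)
    out = [0.0] * len(x)
    for idx, weight in chosen:
        y = experts[idx](x)  # experts that were not chosen are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in chosen]


# Toy setup (assumed sizes): 8 experts, 4-dimensional input, 2 experts active.
random.seed(0)
dim, num_experts = 4, 8
experts = [make_expert(float(s)) for s in range(1, num_experts + 1)]
router_weights = [
    [random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)
]
output, used = moe_layer([1.0, 0.5, -0.5, 2.0], experts, router_weights, k=2)
print("experts used:", used)
print("output:", output)
```

The point of the sketch is that only the selected experts do any work for a given input, so the cost per token depends on how many experts are active rather than on how many exist in total.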
The only downside to the model as of now is that it is not a multi-modal AI model and can only work on text inputs and outputs. A multi-modal AI chatbot can work with data in different formats like text, image, audio, and even video. While the option to upload images is available on the website, it can only extract text from images.
Most AI companies do not disclose their models' specifications, since the models are commercial products and the companies want to protect their interests. However, DeepSeek V3 is well in line with the estimated specs of other models. The best part is that DeepSeek reportedly trained V3 for just $5.5 million, compared to the $100 million figure Sam Altman has mentioned for OpenAI. So let's compare DeepSeek with other models in real-world usage.
Comparison with Other Models
I compared DeepSeek V3 with GPT-4o and Gemini 1.5 Pro (Gemini 2.0 is still in beta) using various prompts. All three models are advanced enough to easily generate good text templates such as emails, or to fetch information from the web and present it however you want. So in this test, I focused on comparing their reasoning and understanding capabilities.
1. I started with a logic puzzle about measuring time with hourglasses. Surprisingly, both ChatGPT and DeepSeek got the answer wrong. DeepSeek concluded that you should just flip the 7-minute hourglass twice and count one more minute, while ChatGPT got confused and then claimed that its own flawed logic measures 15 minutes. Only Gemini answered correctly, even though I was using the older Gemini 1.5 model.
Winner: Gemini
2. I then asked another logic-based question. This problem is harder to solve than it seems. Gemini and ChatGPT gave the correct answer directly, and when I asked them to explain, both did so in 10 to 20 lines at most. DeepSeek, on the other hand, gave a 200-line answer with a detailed explanation. In the end, all the models answered the query correctly, but DeepSeek walked through the complete process step by step in a way that is easier to follow. If you prefer to just skim the reasoning, Gemini and ChatGPT are quicker to read.
Winner: DeepSeek
3. Finally, I asked all the models to create a flow chart for troubleshooting Wi-Fi issues. This is an unfair comparison, as DeepSeek can only work with text for now, so it cannot create a flow chart as an image or document. Its text-based result is hard to comprehend, but the logic holds true. Only ChatGPT generated a perfect flow chart as asked. Gemini simply pulled a flow chart image from the internet that shows how to create flow charts rather than how to troubleshoot Wi-Fi, and then gave me written steps instead of a flow chart. When I asked for a flow chart again, it created a text-based one, since the current stable Gemini model cannot generate images.
Winner: ChatGPT
DeepSeek seems to be on par with the other leading AI models in logical capabilities. The company also claims it solves the needle-in-a-haystack problem: if you give it a very long prompt, the model should not forget details buried in the middle, and will take all of them into account. Note that these are early days and my sample size is too small to draw firm conclusions. We will continue testing and poking at this new AI model and keep you updated. Stay tuned for more.