China's DeepSeek releases open AI model that beats OpenAI's...
DeepSeek V3, released Wednesday under a permissive commercial license, was trained on 14.8 trillion tokens using Nvidia H800 GPUs.
The company reports spending just US$5.5 million on training, a fraction of what comparable models typically cost to train.
The model features 671 billion parameters (listed as 685B on Hugging Face), roughly 1.6 times the size of Meta’s Llama 3.1 405B.
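For readers who want to experiment, the weights are published on Hugging Face, and a minimal sketch of loading them with the open-source transformers library might look like the snippet below. The repo id "deepseek-ai/DeepSeek-V3", the prompt, and the generation settings are illustrative assumptions, and a model of this size realistically requires a multi-GPU cluster rather than a single machine.

```python
# Hedged sketch: loading DeepSeek V3 weights via Hugging Face transformers.
# The repo id and settings below are assumptions for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-V3"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # the repo ships custom model code
    device_map="auto",       # shard weights across available GPUs
    torch_dtype="auto",      # load in the checkpoint's native precision
)

prompt = "Write a Python function that reverses a linked list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```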
In benchmark tests, DeepSeek claims the model outperforms both open and closed AI models, including Meta’s Llama and OpenAI’s GPT-4, particularly on competitive-programming problems from platforms such as Codeforces.
While the model’s size suggests superior capabilities, it also demands significant computing power to run at reasonable speeds. DeepSeek completed training in about two months on the H800 chips, hardware that Chinese companies are now restricted from acquiring under US Commerce Department rules.
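To make the hardware demands concrete, a back-of-the-envelope calculation shows why a single GPU cannot hold the model. This is an assumption-laden sketch: it counts only dense weight storage and ignores activations, KV cache, and serving overhead, none of which are detailed in the source.

```python
# Rough memory estimate for holding the model's weights alone
# (assumptions: dense storage, no activations or serving overhead).
PARAMS = 671e9  # reported parameter count

for precision, bytes_per_param in [("FP8", 1), ("BF16", 2)]:
    gigabytes = PARAMS * bytes_per_param / 1e9
    print(f"{precision}: ~{gigabytes:,.0f} GB of weights")

# FP8:  ~671 GB   -> more than eight 80 GB GPUs just for the weights
# BF16: ~1,342 GB
```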
Despite its capabilities, independent testers note that the model appears to enforce content restrictions on political topics considered sensitive in China.