Breaking Down DeepSeek's Revolutionary LLM

Published On Thu Jan 02 2025
Chinese AI tech charts course beyond ChatGPT - Chinadaily.com.cn

Chinese startup DeepSeek has made its mark on the generative AI landscape with the release of its latest large language model (LLM), which is comparable to leading models from heavyweights like OpenAI. DeepSeek's V3 model, trained for just two months using significantly fewer computing resources, delivers performance on par with the world's top proprietary model, GPT-4o, at a much lower cost than its rivals, according to the Hangzhou-based firm.

DeepSeek-V3 makes it "look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2,048 GPUs for 2 months, $6M)", posted Andrej Karpathy, a founding member of OpenAI, on social media platform X. "This is such clean engineering under resource constraints…all of this looks so elegant," said Tim Dettmers, a research scientist at Allen AI, on X.
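To put the quoted budget in perspective, the figures in Karpathy's post can be turned into a rough back-of-envelope estimate. The sketch below is illustrative only: it assumes two 30-day months and round-the-clock use of all 2,048 GPUs, neither of which is stated in the article, and the function names are hypothetical.

```python
def gpu_hours(num_gpus: int, days: float, hours_per_day: float = 24.0) -> float:
    """Total GPU-hours if every GPU runs for the whole period."""
    return num_gpus * days * hours_per_day


def implied_cost_per_gpu_hour(total_cost_usd: float, total_gpu_hours: float) -> float:
    """Average cost per GPU-hour implied by a total budget."""
    return total_cost_usd / total_gpu_hours


NUM_GPUS = 2048             # from the quoted post
TRAINING_DAYS = 60          # assumption: "2 months" taken as two 30-day months
TOTAL_COST_USD = 6_000_000  # from the quoted post ("$6M")

hours = gpu_hours(NUM_GPUS, TRAINING_DAYS)
rate = implied_cost_per_gpu_hour(TOTAL_COST_USD, hours)
print(f"{hours:,.0f} GPU-hours, about ${rate:.2f} per GPU-hour")
```

Under these assumptions the quoted budget works out to roughly 3 million GPU-hours at about $2 per GPU-hour. The estimate is only as good as the rounded figures in the post.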

Shifting AI Ambitions in China

DeepSeek's new open-source model exemplifies a shift in China's AI ambitions, signaling that merely catching up to ChatGPT is no longer the goal. Instead, Chinese tech firms are now focused on delivering more affordable and versatile AI services.

According to a 2024 report by the China Internet Network Information Center, 230 million users in China had registered to use generative AI-powered products as of June 2024.

Innovative AI Applications

In November, the Beijing-based AI startup Shengshu Technology unveiled its image-to-video tool, Vidu-1.5, which can generate a video from as few as three input images within 30 seconds while establishing logical relationships among the objects in a scene.

Sony's Venom: The Last Dance, screened in China in October, was accompanied by an elegant Chinese ink-style promotional video crafted by Vidu.

Producing a 30-second trailer with traditional film methods typically takes about 30 days, but with Vidu it takes only 10 working days and saves nearly 90 percent on post-production costs, said Zhang Xudong, product director of Shengshu Technology.

AI design can also inspire artists, offering new creative ideas beyond expectations, Zhang added.

In March, Wang Feng and his team at East China Normal University unveiled a million-word AI-generated fantasy novel, Heavenly Mandate Apostle, crafted with a homegrown large language model.

The Xingye chatbot, developed by Shanghai-based startup MiniMax, uses AI to enable users to interact with a virtual romantic partner.

Global Recognition

Xingye's international version, Talkie, along with another Chinese AI companion app, Poly.ai, ranked among the top 10 most downloaded AI apps in the United States during the first half of 2024.

Alibaba's Tongyi LLM, specializing in digital avatar tech, has recently gained internet fame with its "All-People's Stage" feature.

However, the misuse of talking head technology has led to a surge in fake content, such as rewriting the classic Dream of the Red Chamber as a martial arts story.

Last month, China's broadcasting watchdog issued new rules to strengthen oversight, highlighting the country's commitment to closely monitoring the rapid growth of AI.