DeepSeek-R1: Rivaling Giants in AI Innovation

Published On Wed Jan 29 2025
Chinese AI research lab DeepSeek grabbed global attention last week with the release of its open-source AI model, DeepSeek-R1. The company says the model rivals industry giants like OpenAI in critical areas such as mathematical reasoning, code generation, and cost efficiency, signalling a shift in the global AI landscape.

Emergence of DeepSeek

DeepSeek is an artificial intelligence research lab that grew out of Fire-Flyer, a deep-learning branch of High-Flyer, a Chinese quantitative hedge fund. Established in 2015, High-Flyer gained prominence by using advanced computing to analyse financial data. In 2023, its founder, Liang Wenfeng, redirected resources towards creating DeepSeek, with the aim of developing groundbreaking AI models.

DeepSeek's Unique Approach

Unlike most Chinese AI firms, DeepSeek operates independently of major tech giants such as Baidu and Alibaba. Liang’s motivation for this ambitious venture was rooted in scientific curiosity rather than immediate financial returns. “Basic science research rarely offers high returns on investment,” he remarked.

DeepSeek-R1 Model

DeepSeek-R1 is an advanced reasoning model that DeepSeek claims surpasses existing benchmark results on several critical tasks. The model and its variants, such as DeepSeek-R1-Zero, rely on large-scale reinforcement learning (RL) and multi-stage training to achieve their capabilities.

Open-Sourcing by DeepSeek

DeepSeek has also taken a notable step by open-sourcing not just its flagship models but also six smaller distilled variants, ranging from 1.5 billion to 70 billion parameters. These models are MIT-licensed, enabling researchers and developers to freely distil, fine-tune, and commercialise their work.

Technical Advancements

DeepSeek also advanced technical designs such as multi-head latent attention (MLA) and a mixture-of-experts architecture, which made its models more cost-effective. The latest DeepSeek model required just one-tenth of the computing power used by Meta’s comparable Llama 3.1 model, according to a report by Epoch AI.
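
To illustrate why a mixture of experts can reduce computing cost, the sketch below shows top-k expert routing in its simplest form: a router scores every expert, but only the k highest-scoring experts are actually run for a given input. This is a toy illustration of the general technique, not DeepSeek's implementation; the function names, the four scalar "experts", and the gate scores are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalised so that the selected experts sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the k selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Four toy "experts" (hypothetical); real experts are neural sub-networks.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]

# Hypothetical router scores for a single input token.
gate_scores = [0.1, 2.0, 1.5, -1.0]

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(selected, output)
```

With k=2, only two of the four experts are evaluated for this input, so the work per input scales with k rather than with the total number of experts.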

Founder of DeepSeek

Liang Wenfeng, born in 1985, is a Chinese entrepreneur and the founder and CEO of DeepSeek. He is also the co-founder of the quantitative hedge fund High-Flyer. Liang’s educational background includes a Bachelor of Engineering in electronic information engineering and a Master of Engineering in information and communication engineering from Zhejiang University.

In 2016, he co-founded the quantitative investment firm Ningbo High-Flyer, which used mathematics and AI for investment strategies. In 2019, Liang expanded his focus on AI by founding High-Flyer AI, which specialised in AI algorithms and applications. Through DeepSeek, Liang has positioned himself at the forefront of AI research.