The Rise of DeepSeek: A Threat to OpenAI's Dominance?

Published On Wed Feb 05 2025


Until a few weeks ago, few people in the Western world had heard of a small Chinese artificial intelligence (AI) company known as DeepSeek. But on January 20, it captured global attention when it released a new AI model called R1. R1 is a “reasoning” model, meaning it works through tasks step by step and shows its reasoning to the user. It is a more advanced version of DeepSeek’s V3 model, which was released in December. DeepSeek’s new offering is almost as powerful as rival company OpenAI’s most advanced model, o1, but at a fraction of the cost.

Within days, DeepSeek’s app surpassed ChatGPT in new downloads and set stock prices of tech companies in the United States tumbling. It also led OpenAI to claim that its Chinese rival had effectively pilfered some of the crown jewels from OpenAI’s models to build its own.


Allegations of Model Distillation

In a statement to the New York Times, the company said: “We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more. We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the US government to protect the most capable models being built here.” The Conversation approached DeepSeek for comment, but it did not respond.

But even if DeepSeek copied – or, in scientific parlance, “distilled” – at least some of ChatGPT to build R1, it’s worth remembering that OpenAI also stands accused of disrespecting intellectual property while developing its models.

Understanding Model Distillation

Model distillation is a common machine learning technique in which a smaller “student model” is trained on the predictions of a larger, more complex “teacher model”. Once trained, the student can be almost as good as the teacher while representing the teacher’s knowledge in a far more compact form. Crucially, this does not require access to the teacher’s inner workings: all it takes is asking the teacher model enough questions to train the student.


This is what OpenAI claims DeepSeek has done: queried OpenAI’s o1 at a massive scale and used the observed outputs to train DeepSeek’s own, more efficient models.
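To make the idea concrete, the sketch below shows what distillation can look like in code. It uses PyTorch with tiny toy networks and random inputs purely for illustration; the model sizes, temperature, loss and training loop here are assumptions for the example and do not reflect how DeepSeek or OpenAI actually build or train their systems.

```python
# A minimal sketch of model distillation (illustrative only; toy models, not
# DeepSeek's or OpenAI's actual architectures or training setup).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# "Teacher": a larger network whose outputs we can query but whose internals
# we never need to inspect.
teacher = nn.Sequential(nn.Linear(16, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# "Student": a much smaller network trained to imitate the teacher.
student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's output distribution

for step in range(1000):
    # Black-box access: send the teacher a batch of "questions" (random inputs
    # here) and record only its output probabilities.
    x = torch.randn(64, 16)
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / temperature, dim=-1)

    # Train the student to match those probabilities (KL divergence loss).
    student_log_probs = F.log_softmax(student(x) / temperature, dim=-1)
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point of the sketch is that the student only ever sees the teacher’s answers, which is why OpenAI’s allegation centres on large-scale querying of its models rather than any theft of the models themselves.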

Competitive Landscape

DeepSeek claims that both the training and usage of R1 required only a fraction of the resources needed to develop its competitors’ best models. There are reasons to be sceptical of some of the company’s marketing hype – for example, a new independent report suggests the hardware spend on R1 was as high as USD 500 million. But even so, DeepSeek’s models were still built remarkably quickly and cheaply compared with rival models.


There are other reasons that help explain DeepSeek’s success, such as the company’s deep and challenging technical work. DeepSeek’s technical advances included making the most of less powerful but cheaper AI chips (known as graphics processing units, or GPUs).

Legal Disputes

OpenAI’s terms of use explicitly state nobody may use its AI models to develop competing products. However, its own models are trained on massive datasets scraped from the web. These datasets contained a substantial amount of copyrighted material, which OpenAI says it is entitled to use on the basis of “fair use”.

This argument will be tested in court. Newspapers, musicians, authors and other creatives have filed a series of lawsuits against OpenAI on the grounds of copyright infringement. The war of words and lawsuits is an artefact of how the rapid advance of AI has outpaced the development of clear legal rules for the industry.

Future of AI

DeepSeek has shown it is possible to develop state-of-the-art models cheaply and efficiently. Whether it can compete with OpenAI on a level playing field remains to be seen. Over the weekend, OpenAI attempted to demonstrate its supremacy by publicly releasing its most advanced consumer model, o3-mini.

These developments herald an era of increased choice for consumers, with a diversity of AI models on the market. This is good news for users: competitive pressures will make models cheaper to use. As these models become more ubiquitous, we all benefit from improvements to their efficiency.

OpenAI in the AI Landscape

DeepSeek’s rise certainly marks new territory for building models more cheaply and efficiently. Perhaps it will also shake up the global conversation on how AI companies should collect and use their training data.