The truth behind China's large model price war


Whether the API business model is viable does not depend on price.

On the morning of May 21, Alibaba Cloud unexpectedly announced a major price cut at its regular summit: the inference input price of Tongyi Qianwen's GPT-4-class flagship model dropped to 0.5 yuan per million tokens, down 97%.

Alibaba Cloud price reduction announcement | Source: Alibaba Cloud

News of the steep cut immediately drew widespread attention and discussion across the industry, and within just a few hours it set off a chain reaction. Baidu Intelligent Cloud officially announced that two main models in its Wenxin family, ERNIE Speed (8K and 128K context lengths) and ERNIE Lite (8K and 128K context lengths), would be free of charge. Although these two models are not direct counterparts of the model whose price Alibaba cut, the move led many people to declare the API business model of China's large model companies dead. Could the API business model really have vanished within three hours?

Wenxin large models: two main models now free of charge

In fact, ByteDance had announced even earlier that the inference input price of its Doubao general-purpose model was 0.8 yuan per million tokens.

These prices hide many differences in detail, such as concurrency limits and model capability. According to entrepreneurs' own calculations, the cost savings each vendor delivers in real business use are not as dramatic as advertised.

But at least on paper, within a single week ByteDance, Alibaba and Baidu quoted 0.8 yuan, 0.5 yuan and zero per million inference input tokens. Some in the industry even joked that the next vendor to follow would have to pay customers a subsidy to use its API. What caused such a drastic change in so short a time? Is this just cutthroat competition in marketing? Has the API-call business model for large models really been wiped out this way?
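To put the three headline quotes side by side, here is a minimal sketch of what a workload would cost in input tokens at each listed price. The workload size (1 billion input tokens per month) is a hypothetical chosen for illustration, and the sketch ignores output-token pricing, concurrency limits and model differences, which is exactly where the advertised and real costs diverge.

```python
# Listed inference input prices from the announcements, in yuan per million tokens.
PRICES_YUAN_PER_MILLION = {
    "ByteDance": 0.8,
    "Alibaba Cloud": 0.5,
    "Baidu": 0.0,  # the two ERNIE models announced as free
}

# Hypothetical workload, not a figure from the article.
MONTHLY_INPUT_TOKENS = 1_000_000_000


def input_cost_yuan(tokens: int, price_per_million: float) -> float:
    """Cost in yuan of sending `tokens` input tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


for vendor, price in PRICES_YUAN_PER_MILLION.items():
    cost = input_cost_yuan(MONTHLY_INPUT_TOKENS, price)
    print(f"{vendor}: {cost:,.0f} yuan per month")
```

Even at the highest of the three quotes, a billion input tokens a month costs only a few hundred yuan, which is why the list price alone says little about the business model.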

01 Behind the price war: the API-call business model for large models

In fact, when ChatGPT was first released, people had high expectations for a business model in which a large model is called directly as a service. After all, compared with the siloed, project-by-project delivery of the previous AI wave, large models bring more general AI capabilities, so it makes sense to offer them as a standardized service.

Take OpenAI as an example. It has two main ways of making money: member subscriptions such as ChatGPT Plus at $20 per month, and API calls for developers. Driven by these two standardized services, OpenAI's annual recurring revenue (ARR) had reached $1.6 billion, The Information reported on December 31, 2023.

But even with model capabilities as strong as OpenAI's, revenue on this scale is still a drop in the bucket compared with its R&D cost of $10 billion.

Photo source: Visual China

In fact, providing only a model API is still a long way from getting AI applications deployed in real scenarios. Most AI applications also need to optimize the model on top of a general model API by feeding it data and fine-tuning it for the scenario. Having seen this bottleneck, domestic large model vendors spent the past year exploring ways to lower the barrier to AI applications, in the hope of expanding large model call volume.

Take Baidu Intelligent Cloud as an example. Last year it successively launched the model development tool ModelBuilder and the AI application development tool AppBuilder, and introduced several more cost-effective models. However, growth in model calls still seems limited. In April this year, Baidu Intelligent Cloud launched an ecosystem strategy, working with partners that have channels and scenarios to serve customers together, aiming to further increase standardized API usage of the Wenxin models. These signs make it clear that the big companies are not actually prepared to give up the API-call business model. The real problem is that it has not yet brought in revenue at scale.

Last week, Baidu's latest figures showed that the Wenxin models process 250 billion tokens of text every day, and another big company processes 120 billion tokens a day. But a large share of this traffic comes from the big companies' own internal businesses calling the models for AI applications and business exploration.

Clearly, although model vendors have made many attempts, standardized model APIs have not yet seen reliable growth.

This is the backdrop to this week's inference price cuts by the big companies. Once the current level of demand for model API calls is understood, this wave of cuts is not hard to understand: cutting prices does not actually cost much revenue, so it is better to stimulate the market, win some publicity, and encourage more enterprises to start with a "free trial" and begin bringing AI into their business workflows early.
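The claim that price cuts give up little revenue can be checked with rough arithmetic. The sketch below computes an upper bound on API revenue if every one of the 250 billion daily tokens cited above were billed at a single input price. Applying a 0.5 yuan per million token price to that volume is purely an illustrative assumption (the volume and the price come from different vendors), and since much of the traffic is internal, real external revenue would be lower still.

```python
def daily_revenue_ceiling_yuan(tokens_per_day: int, price_per_million: float) -> float:
    """Upper bound on daily revenue if every token were billed at one input price."""
    return tokens_per_day / 1_000_000 * price_per_million


TOKENS_PER_DAY = 250_000_000_000  # daily Wenxin volume cited in the article
ASSUMED_PRICE = 0.5               # assumption: yuan per million tokens

daily = daily_revenue_ceiling_yuan(TOKENS_PER_DAY, ASSUMED_PRICE)
annual = daily * 365

print(f"Daily ceiling:  {daily:,.0f} yuan")
print(f"Annual ceiling: {annual:,.0f} yuan")
```

Under these assumptions the ceiling is on the order of tens of millions of yuan a year, small change for a large cloud vendor, which is consistent with the article's point that the cuts cost little.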

In fact, the real source of this chain reaction was neither Alibaba Cloud nor ByteDance. It was a startup that cut prices before the big companies did.

On May 6, DeepSeek, a large model company under the Chinese startup High-Flyer, open-sourced its second-generation MoE model, DeepSeek-V2, touting more parameters, stronger capabilities and lower costs.

DeepSeek's technical strengths have been widely praised in the global large model community. With model capability close to that of first-tier closed-source models, it also cut the inference price to 1 yuan per million tokens, which is one-seventh the cost of Llama 3 70B and one-seventieth that of GPT-4 Turbo. What is more, DeepSeek-V2 can still be profitable at that price, which clearly reflects cost reductions from a series of advances in model architecture, systems and engineering.

This news sparked extensive discussion among practitioners who actually deploy models, and it also made considerable waves overseas. SemiAnalysis, an independent semiconductor and AI research firm, said that DeepSeek-V2's performance is close to the first tier represented by GPT-4 while its inference price is very low, making it a force from China that cannot be underestimated.

After DeepSeek-V2 announced its price of 1 yuan per million tokens, the large model price war broke out. Zhipu AI, ModelBest, ByteDance, Alibaba and Baidu, as well as iFLYTEK and Tencent Cloud, which followed today, all announced cuts to model inference prices.

The different price-cut strategies have drawn some skepticism: some of the discounted models have low throughput, while high-performance models have not been discounted. In addition, many details make the final cost to enterprises less cheap than advertised. From this perspective, the price cuts are more a case of cutthroat competition among model vendors, driven by market and brand considerations.

Ultimately, a chain-reaction price war was possible because the leading models have not yet opened up a capability gap in the scenarios where they are usable today. Users even have free open-source products to choose from.

The founder of a SaaS vendor told Geek Park, "It doesn't matter whose model I use, because they all end up the same. Over a longer time horizon, there is a 99.9% probability that the model API services these vendors provide will look like today's cloud. In addition, if a general model API cannot be deeply adapted to the scenario, we still have to do dedicated in-depth training on scenario data using an open-source model, and then we will not connect to a model API at all."

What customers ultimately want is end to end: something they can use and see results from, rather than model calls.

02 Large models: giants and startups are playing different games

Of course, falling inference prices are also an inevitable result of technological progress. Engineering, architecture and systems techniques can all be optimized continuously. At the Microsoft Build developer conference this morning, Nadella gave an example of this trend. He said that over the past year GPT-4's performance has improved 6-fold while its cost has fallen to 1/12 of what it was, which works out to a roughly 70-fold gain in performance per unit cost.
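The performance-per-cost figure follows directly from the two numbers quoted. A minimal sketch of the arithmetic, assuming "performance per unit cost" simply means the performance multiplier divided by the cost fraction (the function name is illustrative):

```python
PERFORMANCE_MULTIPLIER = 6  # performance is 6 times what it was
COST_DIVISOR = 12           # cost is 1/12 of what it was


def perf_per_cost_gain(performance_multiplier: int, cost_divisor: int) -> int:
    """Gain in performance per unit cost: multiplier / (1 / divisor)."""
    return performance_multiplier * cost_divisor


gain = perf_per_cost_gain(PERFORMANCE_MULTIPLIER, COST_DIVISOR)
print(f"Performance per unit cost improved about {gain}x")
```

The product is 72, in line with the roughly 70-fold figure quoted from the talk.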

Photo source: Visual China

"The difficulty is exploring the upper limit of model capability. As for the price of model inference, there will always be a way to bring it down." Yan Junjie, founder of MiniMax, described this technical trend on Geek Park's livestream last week. He said that bringing inference prices down to a usable level has already happened three times in academia, and that it is not difficult.

Advances in model technology are the precondition for continued growth of the API-call business model. In fact, the same holds for the models whose price cuts were announced today: inference that is truly large-scale, high-performance and able to support high concurrency still has to be paid for, and the discounts there are limited.

But in the long run, the ultimate test of the API model is model capability. If the technology cannot open up a gap, neither can the price, and the value of a model call will eventually be diluted. It will remain important infrastructure, but its value will go from that of oil to that of water.

From another perspective, a general model API may not be an urgent need today. As founder Jia Yangqing wrote in his WeChat Moments, "From the perspective of the entire AI industry, I would say that cutting prices is a simple strategy anyone can come up with off the top of their head, but real commercial success in To B is much harder." Enterprises' use of AI today is not driven by cost: "The issue today is not that APIs are expensive, but that we need to figure out how to use them to generate business value."

From this point of view, the job of bringing large model capability into the bulk of enterprises' business tasks may fall back into the hands of traditional SaaS vendors (once they have upgraded their products with AI), which will need to deliver it into each scenario, acting as the "trunk logistics" plus "front-end warehouse" of intelligent productivity.

With direct model API supply caught in cutthroat competition, the giants are actually eyeing SaaS that can deliver value. Today, Microsoft announced that GitHub Copilot has 1.8 million paying subscribers. Google has also recently been in talks about possibly acquiring the CRM and marketing giant HubSpot for as much as $30 billion, since Google could use the acquisition to strengthen its product integration in AI.

For the giants, achieving revenue at scale requires attention to both model technology and real scenarios. But in the end, for the API model to generate value, there is "only one path up Mount Hua", as the saying goes: pulling ahead of everyone else.

Large model startups, for their part, face "two paths up Mount Hua": either build better model technology than the big companies, or move from models to products and create value directly.

Intelligence will not be free, but giants and startups alike are still searching for how to create value at scale.
