Alibaba Cuts Prices by 97%, Baidu Announces Free Access: The AI Large-Model Price War Is "Going Crazy" | New Economy Watch
17:11, May 21, 2024 Cover News

Cover News reporters Meng Mei and Ouyang Hongyu

The price war has finally reached the field of AI large models.

On May 21, Alibaba Cloud fired the first shot: the API input price of Qwen-Long, the main model of Tongyi Qianwen and one benchmarked against GPT-4, dropped from 0.02 yuan per thousand tokens to 0.0005 yuan per thousand tokens, a cut of 97% that leaves it at roughly 1/400 of GPT-4's price. Just a few hours later, Baidu took up the fight, announcing that two main models in its Wenxin (ERNIE) family would be free of charge, effective immediately. Together with Zhipu AI, which had already announced price cuts, and ByteDance's self-developed Doubao model, which takes the low-price route, the large-model price war, now joined by every major player, has officially turned white hot.

AI large models are regarded as the epoch-making tools of the artificial intelligence era. Since the release of ChatGPT, domestic giants have flocked into the market, and after the technology accumulation and scenario deployment of the "first year of large models", a new round of price competition has begun. In the view of industry insiders, this round of price cuts shows vendors trying to further capture the market and accelerate the commercialization of AI applications, which will not only help popularize AI among the public but also drive major changes in the industry.

Alibaba, Baidu, and ByteDance compete to cut prices

Large-model pricing enters the "li era"

After this round of price cuts, large-model pricing has truly entered the "li era", with prices measured in li, thousandths of a yuan.

Qwen-Long is reported to be a long-text-enhanced version of the Tongyi Qianwen model, with performance benchmarked against GPT-4 and a maximum context length of 10 million tokens. After this price cut, its input price dropped to 0.0005 yuan per thousand tokens, that is, 0.5 li per thousand tokens.

In addition, Qwen-Long's output price also fell by 90%, to 0.002 yuan per thousand tokens. The API input price of Qwen-Max, the recently released flagship model of Tongyi Qianwen, dropped to 0.04 yuan per thousand tokens, a cut of 67%.
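As a quick sanity check, the reductions above can be recomputed directly from the per-thousand-token prices quoted in the article; the sketch below uses only those figures, and the rounding to 97% and 90% is the article's own.

```python
# Recompute the Qwen-Long price cuts from the figures quoted in the article.
# All prices are in yuan per thousand tokens.

def drop_pct(old: float, new: float) -> float:
    """Percentage reduction from the old price to the new price."""
    return (1 - new / old) * 100

# Input price: 0.02 -> 0.0005 yuan per thousand tokens.
print(f"Input price cut: {drop_pct(0.02, 0.0005):.1f}%")  # ~97.5%, reported as 97%

# Output price: reported as a 90% cut down to 0.002 yuan per thousand tokens,
# which implies an original output price of about 0.02 yuan per thousand tokens.
print(f"Implied original output price: {0.002 / (1 - 0.90):.3f} yuan per thousand tokens")
```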

In fact, companies have been competing on large-model prices since the beginning of May.

On May 6, DeepSeek, backed by the quantitative fund High-Flyer (Huanfang Quant), released its second-generation MoE model, DeepSeek-V2. The model's API price is 1 yuan per million input tokens and 2 yuan per million output tokens (32K context), roughly one-hundredth of the price of GPT-4 Turbo. A few days later, Zhipu AI officially announced a new pricing system: the call price of its entry-level GLM-3 Turbo model dropped from 5 yuan per million tokens to 1 yuan per million tokens, a cut of 80%.

Subsequently, Volcano Engine, the cloud service platform under ByteDance, disclosed the pricing of its self-developed Doubao models: the inference input price of the general-purpose Doubao Pro-32k model is 0.0008 yuan per thousand tokens.

On the afternoon of May 21, with the various companies "in the thick of battle", Baidu announced on its official account that two main models of the Wenxin (ERNIE) family, ERNIE Speed and ERNIE Lite, would be free of charge.

The same "price war" is also being fought abroad. OpenAI released GPT-4o on May 13, which not only greatly surpasses GPT-4 Turbo in function, but also costs only half of its price, which is $5/million tokens. Since the beginning of 2023, OpenAI has also made four price reductions. In the Gemini model series under Google, the price of Gemini 1.5 Flash is 0.35 dollars/million tokens, which is much cheaper than GPT-4o. So far, the price competition between domestic and foreign technology giants around the AI model has been fully launched.

OpenAI has projected that large-model costs will keep falling by 50-75% per year. At present, however, costs are dropping far faster than that.

Scale effects drive a wave of price cuts

And may accelerate an explosion of AI applications

In the past, declines in large-model inference costs depended largely on upgrades in computing power. So what is the logic behind this round of inference cost reductions?

The industry generally believes that, as large-model performance gradually improves, AI application innovation is entering a period of intensive exploration; high inference costs, however, remain the key factor restricting the large-scale deployment of large models.

"To reduce the price of big model reasoning is to speed up the explosion of AI applications." Liu Weiguang, senior vice president of Alibaba Cloud Intelligent Group and president of the Public Cloud Business Unit, predicted that after the price adjustment, the call volume of big model APIs will grow thousands of times in the future.

In Liu Weiguang's view, the technology dividend and economies of scale of the public cloud bring huge cost and performance advantages, which greatly reduce the cost of model inference while speeding it up. He did the math for reporters: the same open-source model is far cheaper to run on the public cloud than in a private deployment. "Taking the open-source Qwen-72B model with a monthly usage of 100 million tokens as an example, calling the API directly on Alibaba Cloud Bailian costs only 600 yuan per month, while a private deployment costs more than 10,000 yuan per month on average."
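To make that comparison concrete, here is a minimal sketch of the arithmetic behind the example. The 600-yuan and 10,000-yuan figures are the ones quoted above; the per-million-token rate is derived from them rather than taken from an official price list.

```python
# Monthly cost comparison from Liu Weiguang's example: Qwen-72B at 100 million tokens per month.
MONTHLY_TOKENS = 100_000_000          # 100 million tokens per month
API_MONTHLY_COST_YUAN = 600           # quoted cost of calling the API on Alibaba Cloud Bailian
PRIVATE_MONTHLY_COST_YUAN = 10_000    # quoted lower bound for a private deployment

# Implied API rate, derived from the quoted monthly figure (not an official list price).
implied_rate = API_MONTHLY_COST_YUAN / (MONTHLY_TOKENS / 1_000_000)
print(f"Implied API rate: {implied_rate:.1f} yuan per million tokens")  # 6.0

# Rough cost ratio between private deployment and the cloud API at this usage level.
ratio = PRIVATE_MONTHLY_COST_YUAN / API_MONTHLY_COST_YUAN
print(f"Private deployment costs at least {ratio:.0f}x as much as the API")  # ~17x
```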

Several industrial revolutions have taught us that technological change is essentially accompanied by a sharp decline in marginal costs, which in turn drives the diffusion of technology. Some insiders said that cutting the prices of AI large models will, on the one hand, help popularize AI among the public, and on the other hand, simply follow market demand. "OpenAI has likewise taken the path of charging for its advanced models while gradually making the lower-tier ones free; GPT-3.5 is already free today, while the best domestic models are only at the GPT-3.5 to GPT-4 level, so without low prices they have no competitive edge."

Lowering the entry threshold

Cloud + API will become the mainstream way to use large models

It is worth noting that the models in this round of price cuts are all general-purpose models, which deliver their capabilities to applications in various scenarios through API interfaces. After this round of cuts, does the business model of charging for large-model APIs in China shrink toward nothing?

According to the analysis of some insiders, free tiers of this kind do lower the entry threshold for business customers. Judging from the detailed terms, however, genuinely high-quality, large-scale, high-concurrency usage will still cost money.

Liu Weiguang also believes that, whether for open-source or commercial models, public cloud plus API will become the mainstream way for enterprises to use large models. "Because of their natural openness, cloud vendors can provide developers with the richest set of models and tool chains, allowing developers to select models, tune them, build applications, and serve them externally, all in one place."
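As an illustration of the "public cloud + API" pattern Liu Weiguang describes, the sketch below sends a request to a hosted model through an OpenAI-compatible chat endpoint; the base URL, environment variable, and model name are placeholders rather than values from the article.

```python
# Minimal sketch of the "cloud + API" usage pattern: the application calls a hosted model
# over HTTP instead of running the model on its own hardware.
# NOTE: the endpoint URL, environment variable, and model name are placeholders.
import os

from openai import OpenAI  # any OpenAI-compatible client illustrates the same pattern

client = OpenAI(
    api_key=os.environ["CLOUD_MODEL_API_KEY"],          # credential issued by the cloud vendor
    base_url="https://example-cloud-provider.com/v1",   # vendor's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="example-long-context-model",                 # e.g. a hosted Qwen-Long-class model
    messages=[{"role": "user", "content": "Summarize this quarter's sales report."}],
)
print(response.choices[0].message.content)
```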

This means that this round of the AI large-model price war mainly benefits individual (C-end) users rather than enterprise (B-end) developers.

For AI large models, it has taken only about a year to go from competing on capability to competing on price, and the chain reaction the giants have set off on pricing will have far-reaching effects. "It is already clear that if technology cannot open up a gap, price will not open one up either, and in the end prices will gradually approach zero." An analyst said that although large models remain important infrastructure, their value will shift from that of "oil" to that of "water".

"In addition, there is another possibility that the new generation of applications is developed by 'hydropower', or even supported by the investment of 'hydropower plants'." The analyst predicted that if small and medium-sized manufacturers have their own modeling capabilities through self research and other means, or even become "nuclear power", the huge change of AI model industry will also come.
