ByteDance pushes large-model prices into the "li era"; will Tencent and others follow?

08:30, May 20, 2024 TechWeb

[TechWeb] At least half of the buzz in tech circles this week came from large models. OpenAI, Google, ByteDance, and Tencent upgraded their large models one after another. The company drawing the most attention was ByteDance, not only for its models and applications but, above all, for its prices.

On May 15, ByteDance released its Doubao family of large models, along with pricing that could shake up the industry. The inference input price of the Doubao general model pro-32k is just 0.0008 yuan per 1,000 tokens, while same-specification models on the market are generally priced around 0.12 yuan per 1,000 tokens, 150 times the Doubao price. For the Doubao general model pro-128k, the inference input price is 0.005 yuan per 1,000 tokens, 95.8% below the prevailing industry price.

This means ByteDance has cut token pricing by an order of magnitude, from prices measured in fen (hundredths of a yuan) to prices measured in li (thousandths of a yuan). According to the prices published by Volcano Engine, one yuan buys 1.25 million tokens of the flagship Doubao model, roughly 2 million Chinese characters, the equivalent of three copies of Romance of the Three Kingdoms.
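As a quick sanity check, the figures above are internally consistent; in the minimal sketch below, the characters-per-token ratio (about 1.6) is simply what the article's numbers imply, not an official conversion factor.

```python
# Back-of-the-envelope check of the Doubao pricing figures cited above.
doubao_pro_32k = 0.0008    # yuan per 1,000 tokens, inference input
market_price = 0.12        # yuan per 1,000 tokens, typical same-spec model

print(market_price / doubao_pro_32k)        # 150.0 -> "150 times the Doubao price"

doubao_pro_128k = 0.005    # yuan per 1,000 tokens, inference input
# A 95.8% discount implies an industry price of roughly 0.119 yuan per 1,000 tokens.
print(round(doubao_pro_128k / (1 - 0.958), 3))

tokens_per_yuan = 1 / doubao_pro_32k * 1_000
print(tokens_per_yuan)                      # 1,250,000 tokens for one yuan
# ~2,000,000 Chinese characters per 1.25M tokens implies ~1.6 characters per token.
print(2_000_000 / tokens_per_yuan)          # 1.6
```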

Will other domestic vendors follow ByteDance into the large-model "price war" it has triggered?

How Tencent, Baidu, and other vendors responded

On May 17, at the Tencent Cloud Generative AI Industry Application Summit, Wu Yunsheng, vice president of Tencent Cloud and head of Tencent Cloud Intelligence, did not respond directly when asked about the question. He said, "Tencent pays more attention to improving the capability of its large models and is committed to providing the industry with products that deliver both capability and price."

On the same day, Jiang Jie, vice president of Tencent Group, announced at the summit that, through continuous iteration, the overall performance of Tencent's Hunyuan model now ranks first in China, and that some of its Chinese-language capabilities have drawn level with GPT-4.

On May 15, Baidu said, "Whether to use a large model should not be judged on price alone but on its overall effect. Only when AI applications work better, respond faster, and reach broader distribution channels can people truly feel the convenience AI brings to social production," a statement widely read as a response to the large-model price war.

At the same time, Baidu disclosed that its Wenxin large model now processes more than 249 billion tokens a day, and stressed that "a closed-source large model plus public cloud can deliver better performance at lower cost than open-source large models, thereby promoting a thriving AI application ecosystem."

Some industry insiders said that cutting the price of large models has become a trend this year and will further accelerate adoption on the application side.

In fact, before ByteDance's move, several vendors had already announced price cuts for their large models. On May 11, Zhipu AI officially announced a new pricing scheme: the call price of its entry-level GLM-3 Turbo model dropped from 5 yuan per million tokens to 1 yuan per million tokens, a cut of 80%.

On May 6, DeepSeek, the AI company under the quantitative fund giant High-Flyer, released its second-generation MoE large model, DeepSeek-V2. The DeepSeek-V2 API is currently priced at 1 yuan per million input tokens and 2 yuan per million output tokens (32K context), only about one percent of the price of GPT-4 Turbo.

The same trend is playing out abroad. OpenAI has cut prices four times since last year. In its just-concluded spring update, OpenAI released its latest multimodal large model, GPT-4o, which not only improves performance substantially but also cuts the price by 50%.

Large models differ in technical strength, training cost, and application scenarios, which leads to different pricing. Behind the price cuts lies continuous optimization of model architecture, training, and other costs.

Tan Dai, president of Volcano Engine, said plainly that ByteDance can offer such low prices on large models because it has done solid work on model architecture, training, and production systems, and has many optimization levers with which to bring prices down.

In addition, the market is calling loudly for lower prices, and the platform must make the cost of trial and error low enough that everyone can use the models. Only heavy usage can polish a good model and significantly reduce the unit cost of model inference.

DeepSeek-V2 cut the cost of large models, especially inference cost, through architectural innovation. Robin Li (Li Yanhong) disclosed at Baidu's Create 2024 AI Developer Conference that, compared with a year earlier, the inference performance of the Wenxin model had improved 105-fold while inference cost had fallen to 1% of its previous level.

Accelerating the commercialization of large models

The continuous decline in the pricing of large models is expected to lead to faster commercialization.

This week, OpenAI, Google, ByteDance, and Tencent all disclosed their latest progress on large models, and all of it centered on the application side. In 2024, landing real application scenarios has become the main theme of the competition.

Even a hard-core technology player like OpenAI, with its newly released multimodal model GPT-4o, no longer leads with parameter counts and benchmark figures as it once did, choosing instead to showcase the user experience across multiple scenarios. This is seen as a sign that OpenAI is accelerating application deployment.

Public reports indicate that OpenAI's monthly active users have exceeded 1 billion and that its annualized revenue run rate reached $2 billion last December. Because revenue is growing so quickly, OpenAI uses the unusual "annualized run rate" metric, that is, the most recent month's revenue multiplied by 12.
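For illustration, the run-rate arithmetic described above works out as follows; the monthly figure is just the back-calculation, not a disclosed number.

```python
# "Annualized revenue run rate": most recent month's revenue x 12.
annual_run_rate = 2_000_000_000            # $2 billion, the reported December figure
implied_monthly_revenue = annual_run_rate / 12
print(f"${implied_monthly_revenue:,.0f}")  # roughly $166.7 million for that month
```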

In China, Baidu, Alibaba, and others have begun to make money from AI. According to its latest financial report, Baidu's AI Cloud revenue in the first quarter of this year was 4.7 billion yuan, up 12% year on year, and the business turned profitable, with generative AI contributing 6.9% of that revenue. On that basis, Baidu's generative-AI revenue in the first quarter was roughly 324 million yuan.
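The article's estimate follows directly from the two disclosed figures:

```python
# Baidu Q1: AI Cloud revenue and the share attributed to generative AI.
smart_cloud_revenue = 4_700_000_000        # 4.7 billion yuan
gen_ai_share = 0.069                       # generative AI's 6.9% share
print(smart_cloud_revenue * gen_ai_share)  # 324,300,000.0 -> about 324 million yuan
```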

Alibaba and Tencent did not disclose specific revenue figures from generative AI. In its fiscal-year 2024 Q4 report, however, Alibaba said AI-related revenue grew by triple digits. Tencent described how AI is boosting its existing businesses: after an upgrade of its advertising technology platform, for example, the recommendation accuracy and delivery efficiency of Tencent's advertising business improved markedly.

In addition, the "AI content" in the financial reports of A-share listed companies such as iFLYTEK and 360 also increased significantly. Among them, the revenue of iFLYTEK in the first quarter increased by 26%, and the loss expanded to 300 million yuan. The reason for the loss is, on the one hand, the seasonal characteristics of business account for a relatively small proportion of the annual income, but just need to be invested in; On the other hand, we have firmly invested in the research and development of large models, the independent control of core technologies and the control of the industrial chain, and the implementation and expansion of large model industries. Among them, the research and development expenses have increased by more than 100 million yuan compared with the same period last year, while the sales expenses have also increased.

Generative AI may be making good progress in helping enterprises improve quality and efficiency, but helping enterprises turn a profit may not be so easy.

For now, large-model applications are still at an early stage, with plenty of room to grow. QuestMobile data show that as of March this year, AIGC apps built on large models had 73.8 million users; although that is an eight-fold increase year on year, it still amounts to only 6% of mobile internet users.

A Ping An Securities research report argues that once large-model capability reaches a certain level, application is inevitable. Large-model vendors are expected to accelerate the formation of a commercial closed loop across the industry chain by improving the cost-performance of their products and driving the promotion and deployment of downstream applications.
