
Douyin can't catch up with Sora

Source: Zimubang

On the hot track of AI text-to-video, Douyin's Jianying is being left further and further behind by OpenAI's Sora.

Recently, the American software giant Adobe announced that it would add a number of AI text-to-video tools to the new version of Premiere Pro, its well-known video editing software. Sora, which has shaken the global technology world over the past two months, and two similar products, Gen-2 and Pika, will join the Adobe suite in the near future.

With the help of third-party AI tools such as Sora, the new Premiere Pro can not only edit and process conventional pre-shot footage, but also generate AI video from text entered by the user and combine the two into one.

Adobe released an official demonstration video in which a man walks to a window to look out over the city at night. Without shooting anything on location, the user simply enters a line of text and Sora generates a clip of the city on a rainy night that connects seamlessly to the preceding footage. The result is almost indistinguishable from real footage.

Sora made its debut in February this year. OpenAI released several demonstration videos at the time but did not disclose the product's progress or a launch date. Now that Adobe has adopted Sora, it appears that Sora has made considerable progress over the past few months and is closer to being opened for public use.

Meanwhile, Jianying, the video editing software behind Douyin, is also moving toward AIGC (artificial intelligence generated content). So far, however, Jianying has not delivered impressive results.

At present, Jianying offers many AI features, including one-click video creation, "cut the same style" templates, AI voice cloning, and digital human narration, but it cannot generate video directly from text. Its overseas version, CapCut, launched a text-to-video feature at the end of February, but the results fall far short of Sora's.

Beyond its lead in technology and product, Sora has now also won Adobe's favor. Adobe has more than 33 million paying users worldwide; by joining the Adobe suite, Sora is expected to earn a share of subscription revenue and so begin to build a business model.

This also means that Jianying, whose AIGC capabilities are still being polished, will find it harder and harder to catch up with Sora.

Outsiders regard Jianying as Douyin's ticket onto the AIGC express. Thanks to Douyin's backing, Jianying has become one of the most widely used mobile video editing apps, producing a large number of short videos every day. On that basis, moving from UGC (user-generated content) to AIGC looks like an easy step.

On February 7 this year, Zhang Nan, a veteran who made major contributions to Douyin, stepped down as CEO of the Douyin Group to lead the Jianying team. In the internal letter announcing the change, she said that AI image generation had made a great impact on her and held great potential, and that she had decided to "put everything down" and set out without hesitation.

Jianying has a solid foundation, and Douyin attaches great importance to it. Yet more than two months later, there is still little news from Jianying, while Sora continues to make rapid progress.

In the AIGC era, Douyin and ByteDance, the company behind it, always seem to be a step slow.

ByteDance began investing in AI eight years ago, setting up a dedicated laboratory and recruiting many of the industry's top talents, but its achievements stayed within content moderation, machine translation, search services, and the like. In the second half of 2022, ChatGPT ignited the industry. ByteDance immediately increased its investment in large models and launched more than a dozen applications, such as AI chatbots, in quick succession, but it has yet to produce a technology or product that shakes the industry.

At the annual all-hands meeting at the end of January this year, ByteDance CEO Liang Rubo lamented that ByteDance "has every problem a big company could be expected to have". He specifically named the AI business, saying: "The company-level semiannual technology review did not begin discussing GPT until 2023, while the large-model startups that are doing well in the industry were all founded between 2018 and 2021."

Now, in video content, the very foundation of the company, OpenAI has once again taken the lead, and Jianying, and indeed ByteDance as a whole, is slow once more.

  1

Backed by the big tree that is Douyin, Jianying once had a great opportunity to take the lead in AI text-to-video.

Jianying launched in 2019, and its professional PC version followed in 2021. It significantly lowered the barrier to UGC video creation: with their material prepared, ordinary users need only a few taps to produce a video of acceptable quality and publish it to Douyin with one click.

Although it is tool software, Jianying also has some community features. Besides the official video creation courses, users can create from video templates made by expert users, a feature known as "cut the same style". This helps improve user retention and activity.

Relying on its rich features and deep integration with Douyin, Jianying gained more than 100 million active users in just three years and rose to first place among similar apps.

The growth of the overseas version, CapCut, has also been striking.

CapCut launched in 2020 and is closely tied to TikTok. According to a mobile app analytics platform, CapCut currently has more than 200 million monthly active users. In addition, according to the market research firm data.ai, as of August 2023 CapCut had more than 490 million users across iPhone and Android, equivalent to about a quarter of TikTok's global users.

Besides its large user base, Jianying's other advantage in AI is that it holds a great deal of video data that can be used to train AI models.

Data, algorithms, and computing power are the three basic elements of large AI models, and data is the foundation. To improve a large model's overall capability, it must be continually "fed" multimodal data, mainly text, images, and video from the internet.

OpenAI, Adobe, and other companies do not hold such data directly and must pay third parties for it, which is expensive. According to media reports, OpenAI spends 1 million to 5 million dollars a year on licensing copyrighted news articles alone; Adobe purchased video clips from Midjourney, another large-model service, at about US$3 per minute to train its own text-to-video model.

By contrast, Jianying is tied to Douyin in China and backed by TikTok overseas. Every day, large numbers of users edit and upload videos with it. This gives Jianying access to a large amount of video content at relatively low cost, laying the foundation for training large models and exploring AI text-to-video features.

Yet since 2019 Jianying has not built strong AIGC capabilities. Instead, it moved into commercialization early.

Jianying introduced VIP membership long ago. Users can pay 20 or 30 yuan per month for exclusive material, selected templates, and so on. AI features are also listed among the membership benefits. According to the official description, these mainly include "unlimited creation", "unlimited camera movement", and "instant universe", all of which apply AI beautification and editing to existing video material.

It is not hard to see that Jianying's AI features are a long way from Sora-style text-to-video. Nor are they cheap: members receive 1,200 points per month, while one use of "unlimited camera movement" costs 480 points. Once the points run out, users can top up at a rate of 1 yuan per 100 points.
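To make the pricing concrete, the short Python sketch below works through the figures quoted above: 1,200 points per month, 480 points per use of the AI feature, and 1 yuan per 100 points when topping up. The constant and function names are illustrative assumptions, not anything published by Jianying, and the sketch assumes points are deducted in whole uses.

```python
# Figures quoted in the article; all names here are hypothetical.
MONTHLY_POINTS = 1200   # points a member receives each month
POINTS_PER_USE = 480    # cost of one use of the AI feature
POINTS_PER_YUAN = 100   # top-up rate: 1 yuan buys 100 points


def included_uses(monthly_points: int = MONTHLY_POINTS,
                  points_per_use: int = POINTS_PER_USE) -> int:
    """Number of full uses covered by the monthly allowance."""
    return monthly_points // points_per_use


def top_up_cost_yuan(uses: int) -> float:
    """Yuan needed to buy enough points for `uses` additional uses."""
    return uses * POINTS_PER_USE / POINTS_PER_YUAN


print(included_uses())       # 2
print(top_up_cost_yuan(1))   # 4.8
```

Under these assumptions the monthly allowance covers two uses, with 240 points left over, and each additional use costs 4.8 yuan.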

From the start, Jianying was positioned to lower the barrier to short video creation as far as possible and to help Douyin's UGC ecosystem flourish. Over the past five years it has indeed accomplished this task. Behind a large share of Douyin's viral videos are Jianying's technology and templates. In particular, once Douyin influencers post fun special-effects videos or beat-synced videos, large numbers of users quickly follow suit and together drive their popularity; that would be hard to achieve without Jianying.

In hindsight, however, Douyin set Jianying's ceiling too low. Jianying had the chance to become an epoch-making product like Sora, yet it has always remained in the category of video editing software.

Since the beginning of this year, Jianying has been trying to catch up, but the first-mover opportunity is gone, the competitor has arrived, and catching up has quickly become harder. Jianying has been left behind by Sora, and ByteDance has still not escaped its pattern in AI of "getting up early but arriving late". This indirectly confirms Liang Rubo's earlier criticism of "the gravity of mediocrity".

  2

The trouble with Jianying, and indeed with ByteDance's entire AI effort, is that it focuses too much on supporting existing businesses and pursues commercialization too early.

Besides launching paid memberships and folding AI features into the benefits package, Jianying has also added many ad slots to the app. For example, when users tap "cut the same style", they see not only their phone's photos and video material but also a banner ad above the material that has almost nothing to do with it.

As tool software with over 100 million users, Jianying should not face especially heavy operating costs or funding pressure. Its early push into memberships and advertising may be related to ByteDance's working style and evaluation standards.

As is well known, ByteDance is an extremely fast-paced and intensely competitive internet giant. If an individual, team, or business cannot quickly deliver visible output, it may be reorganized or even eliminated altogether. Even AI, which needs long-term investment, cannot escape this unwritten benchmark.

As early as 2016, ByteDance set up its AI Lab and brought in many leading figures from academia and industry. At that time OpenAI was also just starting out, pursuing the vision of artificial general intelligence and positioning itself as a non-profit organization.

By contrast, although AI Lab is called a "lab", it has had to work closely with and serve the business. Its official website states that its research focus is developing new technologies that serve ByteDance's content platforms; specific fields include natural language processing, data mining, computer vision, and machine learning, all closely connected to business lines such as Douyin.

In the following years, ByteDance produced a series of AI tools, such as Byte Translator, the AI writing robot Xiaomingbot, and search services for Toutiao and Douyin. Valuable as they are, none of them is an innovative product that breaks through the cognitive boundaries of the AI field or defines a paradigm for AI development.

It was not until the second half of 2022, when ChatGPT, the product of OpenAI's years of honing, swept the world and generative AI became the focus of technology companies everywhere, that ByteDance devoted more energy to this new wave.

Zhang Yiming, who has stepped back from ByteDance's front-line management, takes a strong interest in AI and has encouraged the team to invest heavily. In his view, ByteDance cannot afford to miss AGI (artificial general intelligence), which is indispensable for helping Douyin and TikTok find new growth opportunities worldwide.

Once the founder had spoken, every business unit at ByteDance sprang into action. Since then, ByteDance has launched more than 10 AI products in succession, such as Doubao, Hualu, Coze, and Gauth, and has also added AI features to Jianying, Feishu, and other products.

However, in this round of heavy investment, ByteDance's logic of starting from business needs and benchmarking competitors has persisted. Products such as Doubao are scenario-based applications of existing AI technology rather than original exploration of AGI.

For example, last March Microsoft launched Microsoft 365 Copilot with GPT capabilities, shaking the global office software market. A month later, Feishu announced that it would launch the AI assistant "My AI", which would offer multiple functions in conversational form, including polishing and continuing text, creating schedules, automatically summarizing meeting minutes, and searching the company's internal knowledge base.

A year later, GPT has spread throughout Microsoft's product suite, driving Microsoft's share price from $250 to more than $400. Feishu, rather than staging a comeback on the strength of My AI, announced layoffs at the end of March.

As another example, Sora's demo videos were released on February 16 this year, and CapCut announced a text-to-video feature a week later. Each user can generate 5 videos for free every day. CapCut's technical strength is clearly no match for OpenAI's, and its text-to-video feature is relatively rudimentary; rushing it out to benchmark against Sora inevitably smacks of chasing the hype and forcing a KPI to be met.

Another year has passed since Zhang Yiming singled out AGI, and ByteDance's AI has not significantly narrowed the gap with OpenAI; the gap may even have widened. ByteDance AI's excessive lean toward pragmatism not only caused it to miss past opportunities but may also slow its pace of catching up.

  3

ByteDance has become aware that its AI units each fight their own battles and revolve around individual business lines, and of the harm this causes.

In November 2023, ByteDance drew talent from multiple departments to form the AI department Flow. Zhu Wenjia, head of TikTok technology; Zhu Jun, ByteDance's vice president of product and strategy; Hong Dingkun, ByteDance's vice president of technology; and Qi Junyuan, Feishu's vice president of product, all joined. This high-caliber lineup reveals ByteDance's intention to use Flow to coordinate AI development and eliminate duplicated effort.

At present, Flow is responsible for ByteDance's highest-profile AI products: Doubao, Coze, Hualu, and others. Although Jianying, Feishu, Dali Education, and other units still run AI businesses of their own, their prominence has gradually been overshadowed by Flow's.

Meanwhile, ByteDance's Skylark large model has been polished for nearly a year, and the multimodal model BuboGPT has also made progress, laying the foundation for Flow to produce AI applications at scale. Over time, Flow is expected to reverse the situation in which ByteDance's AI is always a step slow.

However, ByteDance is not OpenAI after all, and it has no deep-pocketed backer like Microsoft. AGI is important, but its pull on existing businesses is not obvious, and it needs a longer period of accumulation and polishing before it can release commercial value. ByteDance's investment in AGI cannot be endless, and the short- and medium-term return on investment must be considered.

In the short term, ByteDance has many areas that need money, people, and resources, such as e-commerce and local life services, all of which require large amounts of hard cash. As a result, although Flow has raised the profile of AI, ByteDance's AI still tends to accommodate business needs.

According to a recent report by Tech Planet, Douyin Life Services has just set up an AI team, hoping to use AI technology to create incremental business value, and has begun developing related AI products, including a content creation platform for life services.

Douyin Life Services may have assembled this team in response to moves by its competitors Meituan and Ele.me.

Meituan recently began small-scale testing of the AI assistant service "Wen Xiaodai", which can recommend takeout items that match users' needs as well as offer meal suggestions. At the beginning of April, Ele.me released "AI Business Assistant" for retail merchants, which can intelligently generate a range of key business reports and key data for them.

In this light, keeping Meituan and Ele.me from stealing the limelight may be one of the basic goals of Douyin Life Services' move into AI.

Given these internal and external pressures, ByteDance is in no position to accumulate as patiently as OpenAI did. ByteDance AI has a dual goal: to keep up with the AGI trend while also being able to ship quickly and serve the business.

With Zhang Yiming and Liang Rubo overseeing the effort, outsiders need not question ByteDance's determination to pursue AI. However, if ByteDance still cannot match OpenAI's pace and level over the next year or two, it might consider another option: returning to the role of "water seller" and becoming a supplier of AGI training material.

As mentioned earlier, data is one of the three basic elements of large models, and ByteDance's Jinri Toutiao, Douyin, TikTok, and other platforms have accumulated hundreds of millions of texts, pictures, and videos. ByteDance can use this data to train its own large models, or it could go a step further and sell the data to third parties such as OpenAI, provided that security and privacy issues are properly addressed.

Acting as a "water seller" in the AGI era is really a variant of the traffic business that ByteDance is good at. ByteDance's current cash cows, advertising and e-commerce, are both built on monetizing traffic. If AI companies can be turned into new customers, the second growth curve that ByteDance has long sought will be within reach.

Moreover, since OpenAI can cooperate with Microsoft and Adobe, cooperating with ByteDance is not inconceivable. After all, Douyin and TikTok are the largest application scenarios for Sora and other AIGC services. If ByteDance and OpenAI can establish a relationship of both competition and cooperation, ByteDance will break out of its cycle of always being a step behind and board the AI express with OpenAI's help.

(Statement: This article only represents the author's view, not Sina.com's position.)
