AI driven super resolution technology landing practice - a truly stable personal space of Netease Yunxin - OSCHINA - Chinese open source technology exchange community

Implementation Practice of AI Driven Super Resolution Technology

In recent years, with the rapid development of deep learning, AI-based super-resolution has shown broad application prospects in image restoration and image enhancement, and has attracted wide attention from both academia and industry. In RTC video, however, many AI algorithms cannot meet the requirements of real scenarios. This article focuses on taking AI technology from research to deployment, and shares the opportunities and challenges that super-resolution faces when it is put into production in the RTC field.

1、 Overview of super-resolution technology

1. The proposal of super-resolution technology

The concept of super-resolution was first proposed by Harris and Goodman in the 1960s. It refers to generating a high-resolution image from a low-resolution image through an algorithm or model while recovering as much detail as possible; it is also known as spectral extrapolation. In the early stage of research, spectral extrapolation was only used in simulations under certain assumptions and was not widely recognized. Only after single-image super-resolution methods were proposed did the technology begin to be widely studied and applied. Today it is an important research direction in image enhancement and in computer vision more broadly.

2. Classification of super-resolution technology

Single-image super-resolution methods can be divided, by principle, into interpolation-based, reconstruction-based and learning-based methods. Because the first two rely on simple algorithms and apply to a limited range of scenarios, their results in real scenes are not ideal. Learning-based methods give the best results in practice. Their core consists of two parts: building the algorithm model and selecting the training set. Depending on the model and the training set, learning-based methods can be further divided into traditional learning methods and deep learning methods. Generally speaking, traditional learning methods use relatively simple models and relatively small training sets. Deep learning methods generally refer to convolutional neural networks trained on large amounts of data, and they are currently a hot research topic in academia. The rest of this section therefore focuses on the development of deep-learning-based super-resolution methods.
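To make the interpolation-based family concrete, here is a minimal sketch of bilinear upscaling in plain Python. It is only an illustration of the general idea: the function name, the list-of-rows image representation and the pixel-centre alignment are assumptions made for this example, not something taken from a particular library.

```python
def bilinear_upscale(img, scale):
    """Upscale a 2-D grayscale image (list of rows of floats) by an integer factor."""
    h, w = len(img), len(img[0])
    out = []
    for oy in range(h * scale):
        # Map the output pixel centre back into input coordinates, clamped to the image.
        sy = min(max((oy + 0.5) / scale - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for ox in range(w * scale):
            sx = min(max((ox + 0.5) / scale - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Blend the four nearest input pixels by their distances.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

Every output pixel is a weighted average of existing pixels, so no new detail is created. That is one way to see why interpolation alone gives limited results on real content.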

3. DL-based SR

SRCNN was the first attempt to apply deep learning to the super-resolution problem. It is a relatively simple convolutional network made up of three convolution layers, each with a different role. The first layer extracts features from the low-resolution image, the second performs the nonlinear mapping from low-resolution features to high-resolution features, and the last reconstructs the high-resolution image. The structure of SRCNN is simple and its super-resolution quality leaves room for improvement, but it established the basic approach that deep learning takes to super-resolution. Later deep learning methods largely follow this approach for super-resolution reconstruction.
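The three-layer structure can be written down as a small table, which also shows how small the network is. The layer sizes below (kernel sizes 9, 1 and 5 with 64 and 32 filters on a single-channel input) are an assumption based on the commonly cited baseline SRCNN configuration; they are not stated in this article.

```python
# (role, in_channels, out_channels, kernel_size) -- assumed 9-1-5 configuration
SRCNN_LAYERS = [
    ("feature extraction", 1, 64, 9),
    ("nonlinear mapping", 64, 32, 1),
    ("reconstruction", 32, 1, 5),
]

def conv_weights(c_in, c_out, k):
    """Number of weights in one convolution layer, biases excluded."""
    return c_in * c_out * k * k

def total_weights(layers):
    """Sum the weights of all layers in the table."""
    return sum(conv_weights(c_in, c_out, k) for _, c_in, c_out, k in layers)

print(total_weights(SRCNN_LAYERS))  # 5184 + 2048 + 800 = 8032
```

Under these assumptions the whole model has only about eight thousand weights, which helps explain both its speed and its limited quality.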

Later networks such as ESPCN and FSRCNN made improvements on SRCNN, but they were still shallow, with no more than 10 convolution layers, and their super-resolution quality was not particularly good. At that time, training deep convolutional networks was difficult. In principle, the performance of a convolutional neural network improves as layers are added. In practice, once the number of layers grows beyond a certain point, gradients vanish during backpropagation, so the network converges poorly and model performance drops. This problem was not solved until ResNet introduced the residual network structure.
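The vanishing-gradient effect, and the reason a skip connection helps, can be illustrated with a toy chain of scalar layers. This is a deliberately simplified sketch and not a real network: each "layer" is a single sigmoid unit, and the names and default values are chosen only for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def input_gradient(depth, residual, w=1.0, x=0.5):
    """Return d(output)/d(input) for a chain of `depth` scalar sigmoid layers."""
    grad = 1.0
    h = x
    for _ in range(depth):
        s = sigmoid(w * h)
        local = w * s * (1.0 - s)   # derivative of sigmoid(w*h) with respect to h
        if residual:
            grad *= 1.0 + local     # h + f(h): the identity path keeps the factor >= 1 for w > 0
            h = h + s
        else:
            grad *= local           # plain chain: every factor is at most 0.25 when w = 1
            h = s
    return grad

print(input_gradient(20, residual=False))
print(input_gradient(20, residual=True))
```

With 20 plain layers the gradient is the product of 20 factors no larger than 0.25, so almost nothing reaches the early layers. With the skip connection each factor is at least 1, so the signal survives.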

VDSR was the first application of residual networks and residual learning to the super-resolution problem, and the first to increase the depth of a super-resolution network to 20 layers. The advantage of residual learning is that the network learns the residual directly, so it converges faster and produces better super-resolution results. Later convolutional neural networks proposed more complex structures. For example, SRGAN uses a generative adversarial network to generate high-resolution images. SRGAN consists of two parts, a generator network and a discriminator network. The generator produces a high-resolution image from a low-resolution image, while the discriminator tries to tell the generated images apart from real high-resolution images. During training the two networks play a game against each other until they reach a balance, and the generator ends up producing high-resolution images with realistic details and textures and a better subjective visual quality. Other deep convolutional methods, such as SRDenseNet, EDSR and RDN, use more complex network structures. Networks keep getting deeper, and super-resolution quality on a single image keeps improving.
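The game between the two networks can be expressed through their loss functions. The sketch below shows only the standard adversarial terms, with discriminator outputs passed in as plain probabilities. SRGAN combines the adversarial term with a content loss, which is omitted here, and the function names are illustrative.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(generated) toward 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_adversarial_loss(d_fake):
    """Generator's adversarial term: push D(generated) toward 1."""
    return -math.log(d_fake)

# A confident, correct discriminator has a low loss ...
print(discriminator_loss(d_real=0.9, d_fake=0.1))
# ... while the generator's loss is low only when it fools the discriminator.
print(generator_adversarial_loss(d_fake=0.9))
```

The two objectives pull `d_fake` in opposite directions, which is the "game" described above. At the balance point the discriminator outputs about 0.5 for both kinds of input.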

Overall development trend

The overall trend in super-resolution can be summarized as a move from traditional methods to deep learning methods, and from simple convolutional networks to deep residual networks. Along the way, model structures have become more complex, networks have become deeper, and single-image super-resolution quality has kept improving, but this trend also brings problems of its own.

2、 Requirements of real-time video tasks and challenges of SR

Requirements for video processing tasks

In RTC, most video processing tasks come from live streaming, conferencing and other instant communication scenarios that place high real-time demands on the algorithm, so real-time performance is the first priority. The second requirement is practicality. During a live stream or a conference, the video captured by the camera is sometimes of low quality and may contain a lot of noise. In addition, the video is compressed for encoding and transmission, and compression further degrades image quality. Real RTC scenarios are therefore relatively complex, and many video processing methods, super-resolution among them, perform well only under idealized research conditions. Finally, video tasks must also improve the experience of users, especially mobile users, by reducing the computing resources the algorithm consumes so that it can run on more terminals and devices.
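The real-time requirement can be made concrete as a per-frame time budget. The following is a rough sketch; the frame rates and stage timings are hypothetical inputs rather than figures from this article.

```python
def frame_budget_ms(fps):
    """Time available to process one frame at the given frame rate."""
    return 1000.0 / fps

def fits_realtime(inference_ms, fps, other_stages_ms=0.0):
    """True if enhancement plus the other per-frame stages fit in the budget."""
    return inference_ms + other_stages_ms <= frame_budget_ms(fps)

print(frame_budget_ms(30))             # about 33.3 ms per frame
print(fits_realtime(20.0, 30, 10.0))   # True: 30 ms fits in the budget
print(fits_realtime(30.0, 30, 10.0))   # False: 40 ms does not
```

Because decoding, rendering and other processing share the same budget, the time actually left for a super-resolution model is smaller than the raw frame interval.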

Measured against these requirements, current super-resolution methods, especially those based on deep learning, have many problems. Most academic research on super-resolution is still at the theoretical stage. For image super-resolution, and video super-resolution in particular, to be deployed at scale, several practical problems must be solved. The first is the network model. In pursuit of better super-resolution quality, many current deep learning methods use large models with more and more parameters, which consume a lot of computing resources and cannot run in real time in many real scenarios. The second is the generalization ability of deep learning models. Every deep learning model depends on how well its training set matches the target scenario: models trained on different training sets perform differently in different scenarios, and a model trained on public datasets may not perform equally well in a real application. The last is the quality of super-resolution in real scenes. Most academic super-resolution methods target an idealized setting, reconstructing a high-resolution image from a downsampled one. In real scenes, however, image degradation comes not only from downsampling but also from many other factors, such as image compression, noise and blur.
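The gap between the idealized setting and real scenes can be seen by writing the degradation out as a pipeline. The sketch below is an assumption-laden toy: 2x2 averaging stands in for blur plus downsampling, Gaussian noise for sensor noise, and coarse uniform quantization for lossy compression. The function names and default values are chosen only for illustration.

```python
import random

def downsample_2x(img):
    """Average each 2x2 block: a crude blur plus 2x downsampling."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def add_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]

def quantize(img, step):
    """Coarse uniform quantization, a stand-in for lossy compression."""
    return [[min(255.0, max(0.0, round(p / step) * step)) for p in row] for row in img]

def degrade(hr, sigma=2.0, step=8, seed=0):
    """Idealized pipelines stop after downsampling; real ones add noise and compression."""
    rng = random.Random(seed)
    return quantize(add_noise(downsample_2x(hr), sigma, rng), step)

hr = [[float(10 * (x + y)) for x in range(4)] for y in range(4)]
print(downsample_2x(hr))  # what an idealized training pair would use
print(degrade(hr))        # closer to what a real receiver sees
```

A model trained only on the output of `downsample_2x` never sees the noise or quantization applied in `degrade`, which is one way to picture the generalization problem described above.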

To sum up, the main challenge for current AI-based super-resolution methods in RTC video tasks is how to achieve video quality enhancement that works well in real scenes with a relatively small network, that is, how to "make the horse run faster while eating less grass".

3、 The development trend of video super-resolution technology

First, deep learning methods will remain the mainstream of super-resolution algorithms.

Traditional methods do not perform well on super-resolution tasks and recover relatively poor detail, whereas deep learning offers a new approach. In recent years, super-resolution methods based on convolutional neural networks have gradually become mainstream, and their results keep improving.

[Figure: Deep learning method]

As the figure above shows, in recent years papers on AI-based super-resolution methods have far outnumbered those on traditional methods, and the gap is expected to widen further in the next few years. Although some problems remain, the emergence of lightweight networks means that deep learning methods may achieve greater breakthroughs in production deployment, and these problems can be expected to be addressed over time. Deep learning will remain the mainstream research direction for super-resolution.

Second, lightweight networks with fewer parameters will play a greater role in bringing super-resolution algorithms into production.

Current deep convolutional methods, such as the deep residual networks EDSR and RDN, can hardly meet the needs of real-time video transmission, so smaller, lightweight networks are better suited to real-time tasks.
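A rough way to compare network sizes for real-time use is to count multiply-accumulate operations (MACs) per frame. The sketch below assumes stride-1 convolutions applied at the full output resolution. Both layer lists are hypothetical configurations invented for the comparison and do not describe EDSR, RDN or any NetEase Yunxin model.

```python
def conv_macs(height, width, c_in, c_out, k):
    """MACs for one stride-1, same-padded convolution over a height x width map."""
    return height * width * c_in * c_out * k * k

def network_macs(height, width, layers):
    """Total MACs per frame for a list of (c_in, c_out, kernel_size) layers."""
    return sum(conv_macs(height, width, c_in, c_out, k) for c_in, c_out, k in layers)

# Hypothetical configurations, for comparison only.
small = [(1, 16, 3), (16, 16, 3), (16, 1, 3)]
deep = [(1, 64, 3)] + [(64, 64, 3)] * 18 + [(64, 1, 3)]

h, w = 720, 1280
print(network_macs(h, w, small))
print(network_macs(h, w, deep))
print(network_macs(h, w, deep) / network_macs(h, w, small))
```

Per output pixel, the small stack costs 2,592 MACs and the 20-layer stack 664,704, a gap of more than 250 times that is multiplied again by the pixel count and the frame rate.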

Third, future super-resolution methods will focus more on real-scene tasks.

Most SR methods in academia focus on degradation caused by downsampling and therefore do not perform well in real scenes, where many kinds of image degradation occur together. More targeted methods, such as super-resolution that also handles compression loss, coding loss and various kinds of noise, may be more practical.

[Figure: Training pipeline of academic super-resolution models]

4、 NetEase Yunxin AI super-resolution algorithm

In RTC, raw video is too large to transmit directly, so it is encoded, sent to the receiver, and then decoded and played. Encoding is essentially video compression: when the network is poor, the encoder uses a large quantization parameter, which compresses the video heavily and produces blocking artifacts and other distortions, so the output image looks blurred. If the decoded video is fed directly into super-resolution in this case, the compression loss is amplified as well, and the result is often unsatisfactory. To address these problems, NetEase Yunxin proposed a video super-resolution method based on coding-loss restoration. It combines a data-driven strategy with network design, simulates real distortion scenarios through data processing, and optimizes layer by layer from model design to engineering implementation. This has produced some breakthroughs on the two major problems that restrict AI super-resolution, and has achieved good results in both real-time model performance and super-resolution quality in real scenes.
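The link between a large quantization parameter and lost detail can be illustrated with plain uniform quantization. This is not how any specific codec or the NetEase Yunxin pipeline works. The coefficient values and step sizes are made up for the example, with large values standing for coarse image structure and small ones for fine detail.

```python
def quantize_dequantize(coeffs, step):
    """Round each coefficient to the nearest multiple of `step`."""
    return [round(c / step) * step for c in coeffs]

def mean_abs_error(a, b):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Hypothetical transform coefficients: a few large ones, many small ones.
coeffs = [52.0, -13.0, 7.0, -3.0, 2.0, 1.0, -1.0, 0.5]

for step in (2, 16):
    rec = quantize_dequantize(coeffs, step)
    kept = sum(1 for c in rec if c != 0)
    print(step, kept, mean_abs_error(coeffs, rec))
```

With the larger step most of the small coefficients collapse to zero and the reconstruction error grows. If such a frame is upscaled as is, the missing detail and the block boundaries are enlarged along with everything else, which is the motivation for restoring coding loss as part of super-resolution.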

[Figure: Algorithm strategy]

The above is some of NetEase Yunxin's practical experience in putting AI-driven super-resolution technology into production. We hope it offers some inspiration and reference.
