Tencent launches new image-to-video model Follow-Your-Pose v2, which generates multi-person motion videos

2024-06-11 11:47 · Source: Webmaster's Home (ChinaZ.com)

Webmaster's Home (ChinaZ.com), June 11: Tencent's Hunyuan team, together with Sun Yat-sen University and the Hong Kong University of Science and Technology, has launched a new image-to-video model named "Follow-Your-Pose v2". The model extends video generation from a single person to multiple people: given a group photo, it can make everyone in the picture move at the same time.

Main highlights:

  • Supports multi-person video motion generation: generates motion for several people in one video with less inference time.

  • Strong generalization ability: high-quality video can be generated regardless of age, clothing, race, background clutter or action complexity.

  • Everyday photos and videos are usable: ordinary photos (including casual snapshots) or videos can be used for model training and generation, with no need to hunt for high-quality images or videos.

  • Correctly handles occlusion between people: when several people block one another in a single picture, the model can generate a result with the correct front-to-back occlusion relationship.

Technical implementation:

The model uses an "optical flow guider" to introduce background optical flow information, so it can generate a stable background animation even when the camera is shaking or the background is unstable.

Through an "inference map guider" and a "depth map guider", the model can better understand the spatial information of the people in the picture and the relative positions of multiple characters, which addresses the problems of multi-character animation and body occlusion.

Evaluation and comparison:

The team proposed a new benchmark, Multi-Character, which contains about 4,000 frames of multi-character video and is used to evaluate multi-character generation.

The experimental results show that "Follow-Your-Pose v2" outperforms the latest existing techniques by more than 35% on two public datasets (TikTok and TED talks) across seven metrics.

Application prospect:

Image-to-video generation has broad application prospects in film and television content production, augmented reality, game production, advertising and other industries, and is one of the most closely watched AI technologies of 2024.

Other information:

Tencent's Hunyuan team also announced an acceleration library for its open-source text-to-image large model (Hunyuan DiT), which greatly improves inference efficiency and shortens image generation time by 75%.
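To put the 75% figure in perspective, a minimal sketch of the arithmetic: a 75% cut in generation time corresponds to roughly a 4x speedup. The function name and the 40-second baseline are hypothetical and used only for illustration; only the 75% reduction comes from the announcement.

```python
def speedup_from_time_reduction(reduction: float) -> float:
    """Return the speedup factor implied by a fractional time reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# The 75% reduction is the reported figure; the baseline time is hypothetical.
reduction = 0.75
baseline_seconds = 40.0
accelerated_seconds = baseline_seconds * (1.0 - reduction)

print(speedup_from_time_reduction(reduction))  # 4.0
print(accelerated_seconds)                     # 10.0
```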

The barrier to using the Hunyuan DiT model has also been lowered: users can call the model with three lines of code through Hugging Face's official model library.
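As a rough illustration of what such a call might look like, here is a sketch that assumes the "official model library" refers to the Hugging Face diffusers package and its HunyuanDiTPipeline class. The model ID, the float16 setting and the CUDA device are assumptions, not details given in the announcement, and the wrapper function generate_image is a hypothetical helper.

```python
def generate_image(prompt: str,
                   model_id: str = "Tencent-Hunyuan/HunyuanDiT-Diffusers"):
    """Generate one image from a text prompt with Hunyuan DiT.

    Assumes the diffusers and torch packages are installed, a CUDA GPU is
    available, and model_id (an assumed repository name) can be downloaded.
    """
    # Imports are kept inside the function so that defining it does not
    # require the heavy dependencies to be present.
    import torch
    from diffusers import HunyuanDiTPipeline

    # The core of the call: load the pipeline, move it to the GPU, run it.
    pipe = HunyuanDiTPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    return pipe(prompt=prompt).images[0]


# Example usage (downloads the model weights on first run):
# generate_image("An astronaut riding a horse").save("astronaut.png")
```

The load, move-to-device and call steps inside the function are presumably the "three lines" the announcement has in mind; everything else is scaffolding.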

Paper: https://arxiv.org/pdf/2406.03035

Project page: https://top.aibase.com/tool/follow-your-pose
