
    Google I/O: Google unveils a "Universal Translator" that dubs speech and lip-syncs speakers

    [Original by Zhongguancun Online] Author: Biscuit

    Google is testing a powerful new translation service that can redub a video in a new language while also synchronizing the speaker's lips with words they never actually said. Google also stressed the potential for abuse and the measures it has taken to prevent it.

    "Universal Translator" was presented at the Google I/O conference by James Manyika, who heads the company's new "Technology and Society" department. He offered it as an example of something that has only recently become possible thanks to progress in artificial intelligence, but that also carries serious risks that must be considered from the start.

    The "experimental" service takes an input video, in this case a lecture from an online course originally recorded in English. It transcribes the speech, translates it, regenerates the speech in the target language (matching style and tone), and then edits the video so that the speaker's lips more closely match the new audio.

    So it's basically a deepfake generator, right? Yes, but the technology that is used for malicious purposes elsewhere also has genuine practical value. In fact, some companies in the media world are doing this kind of thing right now, redubbing lines in post-production for more than a dozen reasons. (The demonstration was impressive, but it must be said that the technology still has a long way to go.)

    However, those tools are professional ones made available within a strict media workflow, not a checkbox on the YouTube upload page. Universal Translator is not such a checkbox either, but if it ever becomes one, Google will need to reckon with the possibility that it could be used to create disinformation or cause other unforeseen harm.

    Manyika called this "the tension between boldness and safety", and striking that balance may be difficult. Clearly, the service cannot simply be released widely for anyone to use without restrictions. Still, the benefits, such as offering an online course in 20 languages without subtitles or re-recording, are undeniable.

    "This is a great step forward for learning and comprehension, and we have seen promising results in course completion rates," Manyika said. "But there is an inherent tension here: some of the same underlying technology could be misused by bad actors to create deepfakes. So we have built the service with guardrails to prevent misuse, and we make it accessible only to authorized partners. Soon we will integrate new innovations in watermarking into our latest generative models to help with the challenge of misinformation."

