
A Microsoft sign is seen at the company’s headquarters on March 19, 2023 in Seattle, Washington. New Microsoft AI animates faces from photos.

New York (CNN) - The Mona Lisa can now do more than smile, thanks to new artificial intelligence technology from Microsoft.

Last week, Microsoft researchers detailed a new AI model they’ve developed that can take a still image of a face and an audio clip of someone speaking and automatically create a realistic-looking video of that person speaking. The videos, which can be made from photorealistic faces as well as cartoons or artwork, are complete with compelling lip syncing and natural face and head movements.

In one demo video, researchers showed how they animated the Mona Lisa to recite a comedic rap by actor Anne Hathaway.

Outputs from the AI model, called VASA-1, are both entertaining and a bit jarring in their realness. Microsoft said the technology could be used for education or “improving accessibility for individuals with communication challenges,” or potentially to create virtual companions for humans. But it’s also easy to see how the tool could be abused and used to impersonate real people.

It’s a concern that goes beyond Microsoft: as more tools to create convincing AI-generated images, videos and audio emerge, experts worry that their misuse could lead to new forms of misinformation. Some also worry the technology could further disrupt creative industries from film to advertising.

For now, Microsoft said it doesn’t plan to release the VASA-1 model to the public immediately. The move is similar to how Microsoft partner OpenAI is handling concerns around its AI-generated video tool, Sora: OpenAI teased Sora in February, but has so far only made it available to some professional users and cybersecurity professors for testing purposes.

“We are opposed to any behavior to create misleading or harmful contents of real persons,” Microsoft researchers said in a blog post. But, they added, the company has “no plans to release” the product publicly “until we are certain that the technology will be used responsibly and in accordance with proper regulations.”

Making faces move

Microsoft’s new AI model was trained on numerous videos of people’s faces while speaking, and it’s designed to recognize natural face and head movements, including “lip motion, (non-lip) expression, eye gaze and blinking, among others,” researchers said. The result is a more lifelike video when VASA-1 animates a still photo.

For example, in one demo video set to a clip of someone sounding agitated, apparently while playing video games, the speaking face has furrowed brows and pursed lips.

The AI tool can also be directed to produce a video where the subject is looking in a certain direction or expressing a specific emotion.

On close inspection, there are still signs that the videos are machine-generated, such as infrequent blinking and exaggerated eyebrow movements. But Microsoft said it believes its model “significantly outperforms” other, similar tools and “paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.”

The-CNN-Wire
™ & © 2024 Cable News Network, Inc., a Warner Bros. Discovery Company. All rights reserved.