Voicebox, a text-to-speech generative AI model, is introduced by Meta

Meta has unveiled Voicebox, a “state-of-the-art” generative AI model that transforms text to speech and includes capabilities for editing audio and working across languages.

Details on Voicebox, Meta’s text-to-speech generative AI model

A video posted on Meta CEO Mark Zuckerberg’s Instagram Channels demonstrated how Voicebox could read text in a range of speech styles, remove loud interruptions from audio tracks, learn and imitate speakers’ voices, and even produce output in multiple languages.

On Friday, Meta published a blog post describing how the model could perform tasks it had not been specifically trained to do.

The multilingual model can generate speech in English, French, German, Spanish, Polish, or Portuguese. Other listed features included varied text-to-speech, style transfer, content correction, in-context text-to-speech, and noise removal.

“This type of technology could be used in the future to help creators easily edit audio tracks, allow visually impaired people to hear written messages from friends in their voices, and enable people to speak any foreign language in their own voice,” according to Meta’s blog post.

Meta also said the model could give virtual assistants and non-player characters in the metaverse more natural-sounding voices.

Voicebox, according to Zuckerberg, is still a “research project,” but Meta plans to expand on it.

The video clip ended with what seemed like Meta’s CEO saying “more soon” in Polish.

Meta has been building AI models to handle many types of media, and several of these have been released as open source for research purposes.
