
Clone Voices Are Coming. Media Companies And Brands See Big Opportunities Ahead.

A quarter century after the world marveled at cloned sheep, new AI technology is allowing human voices to be synthetically reproduced. In what is likely the beginning of a wave of announcements by audio companies, iHeartMedia said last week it will use Veritone’s synthetic voice technology as a tool to translate its podcasts into other languages. Veritone says the cloned voice opportunity is ripe for broadcast radio, too.

“There are plenty of use cases in radio. One of the biggest things that's a possibility is the ability to localize or personalize that content,” said Sean King, Senior VP at Veritone. Not only could a voice give a personal “thanks for listening” type message tailored for every market, but there are also possibilities to create real-time content such as localized news and weather updates. King said the process could be automated, using input from news or weather feeds.

In an interview, King said cloned voices will also improve the consistency of what listeners hear, in a way that voice-tracking cannot. “If you're a talent and you need to come in and bust out 500 reads, is the quality of the line that I'm going to give at number seven going to be the same quality as number 497?”

Bert Weiss, host of the Westwood One-syndicated “The Bert Show,” is among the early adopters of the Veritone technology. With his Atlanta-based radio show now heard in 27 markets, Weiss said last month he will use the tool to cut custom liners, promos and content for affiliate stations without having to record items individually.

“Recutting production or creating new promotional content is vital but time-consuming,” Weiss said. The new tech also allows his producer to develop original content and repurpose audio clips with nothing needed from him other than his consent. “Not only will that reduce production cost, but also expands where and how my voice can be heard,” Weiss said.

While iHeart is the first broadcast company with plans to use the technology companywide, other radio groups may soon follow. “We are in discussions with many different radio groups,” King confirmed.

Some air talent may be unsettled by the idea of AI-powered technology, including the prospect that it could put some personalities out of work. But King thinks it will instead create new opportunities for radio talent, especially in smaller markets where a single person is wearing multiple on-air hats.

“Being able to provide more voices and more opportunities will create a better listening experience,” King said, while opening up practical benefits for talent. “If they were sick or they lost their voice, or for whatever reason, they still have that ability to be able to create content and continue to do their job.”

Radio talent may also see new money-making advertising opportunities in which their voice could be used to create campaigns. For marketers, that could mean new ways to get their brands in front of new audiences with content that is more personalized or localized.

Each synthetic voice audio file will contain a special audio watermark to alert Veritone that the voice heard is not real. But whether listeners are aware of that will be up to users of the technology. “We give best practice guidelines,” King said. “But at the end of the day, we want them to be able to use this how they see fit within their organization.”

Synthetic Voice Podcasts Go First

Synthetic voices on radio are unlikely to be widespread in the near term. But it will only be a matter of months before iHeart begins using them to convert English-language podcasts to additional languages, starting with Spanish.

To create the podcasts in a new language, Veritone converts the English-language content into text, then translates it into the new language. That new text file is then recreated as a voice file, albeit in a different language. For talent who give their consent, what makes the result remarkable is that the show remains in their own voice, giving the listener the same experience as someone who heard the show in its original language.
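
For readers who want to picture the workflow, the three steps described above (speech to text, translation, then voice synthesis) can be sketched in Python. This is only an illustration under stated assumptions: the class name `DubbingPipeline`, its fields, the `consent` flag and the placeholder stage functions are all hypothetical and are not Veritone's API or implementation. A real system would plug in actual speech-recognition, machine-translation and voice-cloning services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubbingPipeline:
    """Hypothetical three-stage pipeline: audio -> text -> translated text -> audio."""
    transcribe: Callable[[bytes], str]        # speech-to-text stage
    translate: Callable[[str, str], str]      # (text, target language) -> translated text
    synthesize: Callable[[str, str], bytes]   # (text, voice id) -> synthetic audio

    def run(self, audio: bytes, target_lang: str, voice_id: str, consent: bool) -> bytes:
        # The article stresses that talent consent is required, so check it first.
        if not consent:
            raise PermissionError("talent consent is required before cloning a voice")
        text = self.transcribe(audio)
        translated = self.translate(text, target_lang)
        return self.synthesize(translated, voice_id)


# Placeholder stages (assumptions for illustration only, not real services).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


def fake_synthesize(text: str, voice_id: str) -> bytes:
    return f"{voice_id}:{text}".encode("utf-8")


pipeline = DubbingPipeline(fake_transcribe, fake_translate, fake_synthesize)
result = pipeline.run(b"Welcome to the show", "es", "host-voice", consent=True)
print(result)
```

Keeping each stage as a swappable function mirrors the description in the text: the translation step works purely on text, so the recognition and synthesis services on either side can change without affecting it.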

“On average, we need about three hours of content, sometimes more, sometimes less,” said King. “We can take that existing audio footage to create their own clone voices. And so it doesn't require anything more than getting the appropriate consent from the talent.”

Veritone is currently able to make synthetic voice content in more than 100 languages.

The Metaverse Next?

Veritone continues to develop new features and technologies. While King cannot say what sort of upgrades are in the works, he can say the company is working in a growing technological space that includes the metaverse, where companies including iHeartMedia have already staked a claim. That could create new opportunities for radio talent. Imagine, for instance, a virtual lunch where Ryan Seacrest chats about the latest hit records.

“If it's a new distribution channel, it’s going to be able to create a whole lot of opportunities for content creators, brands and consumers alike,” King said.
