Veritone Adds Five Dozen New Languages To Its Cloned Voice Library.


Want to turn your podcast into Persian, Burmese or Zulu? Each is now an option as Veritone has added 72 new stock voices to the Veritone Voice library. The company is expanding its menu as podcasters begin using its artificial intelligence-powered platform to convert English-language shows into other languages and reach new markets.


In an update, Veritone says it has upgraded its text editor capabilities. Users will be able to edit breaks, pauses, and selected phonemes and prosody. They can also adjust voice rate, pitch and volume, and create back-and-forth conversions between languages, with the ability to enhance the synthetic speech for a line of text or word by word. With the new “say-as” feature, users also gain more control over voice outputs. Veritone says that control is critical in overcoming the roadblocks that prevent synthetic speech from sounding more natural and human.
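The controls named here (breaks, phonemes, prosody, say-as) share their names with elements of the W3C Speech Synthesis Markup Language (SSML). Veritone has not said what markup its editor uses, so the following is only a sketch, assuming SSML-style tags, of what such annotations can look like. The sample sentence, date and attribute values are invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_ssml() -> str:
    """Build a small SSML-style document using standard W3C element names.

    Assumption: this is generic SSML, not Veritone's documented format.
    """
    speak = ET.Element("speak")
    speak.text = "Welcome back to the show."

    # A half-second pause between sentences.
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = " This episode was recorded on "

    # say-as tells the engine how to read an ambiguous token (here, a date).
    date = ET.SubElement(speak, "say-as", {"interpret-as": "date", "format": "mdy"})
    date.text = "1/2/2023"
    date.tail = ". Try the "

    # phoneme pins down the pronunciation of a single word.
    word = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": "təˈmeɪtoʊ"})
    word.text = "tomato"
    word.tail = " soup. "

    # prosody adjusts rate, pitch and volume for a span of text.
    outro = ET.SubElement(
        speak, "prosody", {"rate": "slow", "pitch": "+2st", "volume": "loud"}
    )
    outro.text = "Thanks for listening."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed when text is edited word by word.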


To help creators reach wider audiences, Veritone has added 72 new stock voices to the Veritone Voice library, bringing the total close to 300. The latest additions include Amharic (Ethiopia); Bangla (Bangladesh); Persian (Iran); Filipino (Philippines); Galician (Spain); Javanese (Indonesia); Khmer (Cambodia); Burmese (Myanmar); Somali (Somalia); Sundanese (Indonesia); Uzbek (Uzbekistan); and Zulu (South Africa).


“It gives brands the ability to meet customers' needs with hyper-realistic localized voices and languages at scale and in a fraction of the time and cost it would take to try to create one from a human voice,” said Veritone Senior VP Sean King.


Since Veritone Voice launched in May 2021, more than two thousand customers have already generated thousands of synthetic voice clips using its technology. Last week, iHeartMedia announced it would use the voice-cloning product to translate podcasts into multiple languages, beginning with Spanish by mid-year.


“We envisioned a synthetic voice solution that would not only humanize digital content but also remove the limitations of where and how voice personalities could be securely and efficiently replicated and heard,” said Chris Doe, Head of Commercial Products at Veritone. “In less than a year, we’ve achieved just that with these added features, which help corporations, media companies, brands, broadcasters, podcasters and advertisers create revenue opportunities and satisfy increasing consumer demand for more natural, localized and personalized content.”


To create the podcasts in a new language, Veritone converts the English-language audio into text, then translates that text into the new language. The translated text is then rendered as a voice file in the target language. For talent who give their consent, the notable part is that the translated show remains in their own voice, giving listeners the same experience as those who heard the show in its original language. The company says it needs about three hours of a host’s voice in order to clone it for new content, either in their native tongue or one of the range of other languages now available.
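The three steps described above (speech to text, translation, then speech synthesis in a cloned voice) can be sketched as a simple pipeline. This is a minimal illustration, not Veritone's implementation: the class name, function names, and the `fake_*` stand-ins are all hypothetical, and a real system would call actual transcription, translation and synthesis services in their place.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubbingPipeline:
    """Chains the three stages; each stage is a pluggable function (hypothetical names)."""

    transcribe: Callable[[bytes], str]       # audio -> source-language text
    translate: Callable[[str, str], str]     # (text, target language) -> translated text
    synthesize: Callable[[str, str], bytes]  # (text, voice id) -> audio

    def run(self, audio: bytes, target_lang: str, voice_id: str) -> bytes:
        text = self.transcribe(audio)
        translated = self.translate(text, target_lang)
        return self.synthesize(translated, voice_id)


# Stand-in stages so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


def fake_synthesize(text: str, voice_id: str) -> bytes:
    return f"{voice_id}:{text}".encode("utf-8")


if __name__ == "__main__":
    pipeline = DubbingPipeline(fake_transcribe, fake_translate, fake_synthesize)
    print(pipeline.run(b"hello listeners", "es", "host-clone"))
```

Keeping each stage behind a plain function signature means any one of them can be swapped out without touching the other two.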


King said that each file that uses a cloned voice will include a special audio watermark to alert Veritone that the voice heard is not real. But whether listeners are aware of that will be up to users of the technology. “We give best practice guidelines,” he told Inside Radio. “But at the end of the day, we want them to be able to use this how they see fit within their organization.”
