Veritone Voice FAQs

GENERAL

What is Veritone Voice and who is it built for?

Veritone Voice is a hyper-realistic synthetic Voice as a Service (VaaS) solution that allows celebrities, athletes, influencers, broadcasters, podcasters and other prominent figures across industries to securely and ethically create, distribute, and monetize synthetic voices.

Veritone offers a custom voice cloning solution that allows users to securely create verified custom synthetic voices in many different languages.

In addition to custom voices, the Veritone Voice self-serve application enables users to create voice projects with stock voices from a library of more than 200 voices across more than 150 languages.

What benefits will Veritone Voice deliver for customers and prospective customers?

As an end-to-end solution for synthetic voice, Veritone Voice gives users a complete suite of voice capabilities, including voice creation, management, licensing, rights and clearances, workflows, and monetization.

Here are a few industries that can benefit from Veritone Voice:

  • Advertising: Sports announcers, broadcasters, athletes, celebrities, or influencers can cost-effectively produce endorsements and advertisements at scale.
  • Broadcast & Podcast: Podcast hosts and distributors, broadcasters, TV/Film producers, dubbing studios, and other companies producing audio with human speech can connect with audiences in multiple languages and regions with personalized messaging, giving the audience a more authentic experience.
  • Film and Studio Production: Craft professional voices from talent, automate voice retakes and voice overs, or localize and regionalize into foreign languages and dialects.
  • Government: Capture trusted voices of authority, including mayors, governors or other public officials, to engage with local or wider populations in the appropriate languages, dialects, and accents.

For more industries, please visit Veritone Voice.

What business problems does Veritone Voice solve?

Advertisements & Endorsements: Sports announcers, broadcasters, athletes, celebrities, or influencers can cost-effectively produce endorsements and advertisements at scale.

Accessibility & Extensibility: Synthetic voice reduces the time needed to produce audio endorsements, saving production costs and accelerating voice talents’ ability to monetize their brand.

Localization, Regionalization, Personalization: Podcast hosts and distributors, TV/Film producers, dubbing studios, and other companies producing audio with human speech can connect with audiences in multiple languages and regions with personalized messaging, giving the audience a more authentic experience.

Audiobooks: Scale audiobook production with synthetic voiceover. With the ability to adjust pitch, tone, speed and more, it’s easy to create lifelike narration for audiobooks.

Voice Over for News, Financial, and Weather Reporting: Augment current talent and seamlessly add voice overs to news, financial, and weather reports as well as talent endorsements to broadcasts.

eLearning Courses and Training Materials: Produce and enhance learning materials for online courses and corporate training collateral for greater retention, at scale.

Resurrect Deceased Talent Voice: Working with the estate and/or IP owner, bring familiar and loved voices back to fans.

Emergency Announcements: Create and replicate trusted voices of local leadership for important news and announcements, and easily translate them into multiple languages in minutes.

Voice Over for Scene Narration: Insert and edit voice overs for game, film, and TV scenes, and license player or celebrity voices to further immerse and connect with your target audience.

Museums and other entertainment experiences: Enhance entertainment experiences with well-known and beloved voices.

How is Veritone Voice different from existing synthetic voice vendors?

Veritone Voice is the only synthetic voice offering that supports both text-to-speech and speech-to-speech modalities, giving clients the ability to create voices for all of their voice projects. As part of Veritone’s VaaS solutions, Veritone Voice offers a comprehensive suite of integrated voice features, including voice creation, voice management, voice licensing with rights and clearances, voice workflows, and voice monetization.

Veritone Voice is built on Veritone’s proprietary enterprise AI platform, aiWARE. For an additional fee, users can leverage aiWARE cognitive engines, such as translation and transcription, and combine them with advanced automated workflows to deliver transformed audio at scale.

How will Veritone protect against “deepfakes” and potentially malicious intentions?

Owning one’s voice and protecting one’s IP are critical. We want to make sure that we not only help our clients generate licensing opportunities but also ensure they have the necessary support to navigate rights and clearances. This will ensure their name, image, and likeness are only being used by approved parties that maintain high standards.

Veritone Voice safeguards include regulated processes and checkpoints to ensure proper rights, clearances, and pricing are followed. Added IP protection includes inaudible watermarks and proprietary tools to help ensure content can only be accessed after permission is granted.

The voice creation process includes both written and verbal consent verification. Once the voice is created, the talent has the right to approve all synthetic recordings. All created recordings include an inaudible watermark that Veritone can verify.

All voice training data and voice models are stored in a highly secure, proprietary digital asset management platform, ensuring the protection of your data.

Only authorized users will have access to create new clips, and all clip creation is tracked at the user level. The voice model code only works on Veritone systems and cannot be deployed anywhere else.

If at any time the voice owner would like their voice clone deprecated, Veritone will destroy the voice model.

Will synthetic voice raise questions about authenticity?

For Veritone Voice clients, synthetic voice is a powerful tool that is used under the complete control of the voice owner. Some clients may use synthetic voice only for localization or production editing, but Veritone Voice can also be used for complete end-to-end production. The voice owner, who knows their audience best, has full control.

As a best practice, we recommend adding disclaimers so the audience is fully aware that they are hearing a synthetic voice.

TECHNOLOGY

What is Voice as a Service, and what capabilities will it deliver?

Veritone Voice is a hyper-realistic synthetic Voice as a Service (VaaS) solution that allows celebrities, athletes, influencers, broadcasters, podcasters and other prominent figures across industries to securely and ethically create, distribute, and monetize synthetic voices.

What is the difference between text-to-speech vs. speech-to-speech processes?

Text-to-speech (TTS) is the process of producing synthetic speech from a text file.

Speech-to-speech (STS) is the process of producing synthetic speech from an audio file.

What stock synthetic voices and languages are immediately available?

Veritone Voice offers a rich marketplace of more than 200 stock voices that are immediately available to customers. You can choose from a broad and diverse range of genders, more than 150 languages, and numerous accents, and stylize each voice so that it suits your needs.

What is required to create your own custom synthetic voice?

Custom voice creation is supported by our professional services team. To start, the voice talent or individual whose voice will be recorded and used to create a custom voice model must explicitly consent (verbally and in writing) to the creation of their voice model. If the voice talent is deceased, the estate, as well as the IP owner if that is not the estate, must provide explicit consent.

  • High-quality audio samples are required to generate a custom voice model.
  • Three hours of target voice audio are required for model training. This training data can be recorded from new scripts provided by Veritone, or repurposed from existing audio if quality specs are met.
  • Custom voice models can be built in approximately three weeks. Once the model is built, synthetic voice clips can be produced. Pricing varies per voice model and package.
  • For text-to-speech usage, once the custom voice model is created, it will be enabled within your Veritone Voice account to create text-to-speech files.
  • For speech-to-speech usage, the requirements and process for creating a custom voice are similar to those for text-to-speech (see above), but they can also require training on the source voice actor, who will produce the audio content that drives the custom voice model.
  • The process for training on the source voice actor is similar to the creation of the custom voice model and requires a similar quantity and quality of training data. Note that a single source voice actor can drive multiple custom voice models and, conversely, a custom voice can be driven by multiple source actors at scale.
  • If you do not have access to a source voice actor, one can be provided to you by Veritone for a fee.

Can I choose the specific voice engines to create my AI-generated custom voice?

Veritone Voice currently has access to a set of market-leading voice engines that is growing daily. A member of our professional services team will help identify the right engines and models for your use cases.

How does Veritone Voice store the voice training data and voice models?

All voice training data and voice models are stored in a highly secure, proprietary digital asset management platform, ensuring the protection of your data.

Is Veritone Voice available on both desktop and mobile?

Veritone Voice is mobile-responsive and built for any browser on desktop and mobile.

At this time, Veritone Synthetic Voice does not have a mobile app.

PRICING & TERMS

What is the cost for subscribing to Veritone Voice?

Veritone Stock Voices

$500/mo USD

Plan Details:

  • Create unlimited projects
  • Produce text-to-speech voice clips totaling up to 1 million characters* per month
  • Access to more than 200 voice models across more than 150 languages.
  • Access to online customer support and 24/7 knowledge base.
  • Ability to request a custom workflow at any time (additional cost may be required)
  • Ability to create a custom voice at any time (additional cost may be required)
  • Month-to-month subscription with automatic renewal. You can cancel at any time.
*A character is any letter, number, space, punctuation mark, or symbol that can be typed on a computer.
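
To get a rough sense of how much of the monthly allowance a set of scripts will use, you can count characters before submitting them. The following is a minimal sketch in Python, not a Veritone tool: the function names are hypothetical, and it assumes the allowance is counted the same way Python's `len()` counts a string (every letter, number, space, punctuation mark, and symbol is one character). Veritone's exact counting method is not documented here and may differ for some symbols.

```python
# Plan limit quoted in the plan details above.
MONTHLY_CHARACTER_LIMIT = 1_000_000


def count_characters(script: str) -> int:
    """Count every character in the script, including spaces and punctuation.

    Assumption: one Python string element equals one billable character.
    """
    return len(script)


def remaining_characters(scripts, limit=MONTHLY_CHARACTER_LIMIT):
    """Return how many characters would be left after producing all scripts.

    A negative result means the scripts exceed the monthly limit.
    """
    used = sum(count_characters(script) for script in scripts)
    return limit - used


if __name__ == "__main__":
    # Hypothetical draft scripts for one month.
    drafts = ["Hello, world!", "Welcome back to the show."]
    print(remaining_characters(drafts))
```

Spaces and punctuation count toward the total, so a script's character count is usually noticeably higher than its letter count.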

Veritone Custom Voices

Pricing for building a custom voice is tailored to each voice. Contact Us.

Veritone Synthetic Voice API

We do not allow public API access to Veritone Voice at this time.

PROTECTION & GOVERNANCE

How will Veritone protect my synthetic voice model and its usage?

Our team of experts works closely with you and your team to define a master services agreement or platform licensing terms.

The VaaS solution includes such features as inaudible watermarks (the automated inclusion of a copyright tone); traceability (the ability to track the components used to replicate your voice clips); and licensing protocols (regulated processes and checkpoints to ensure proper rights, clearances, and pricing are followed).

Once a synthetic voice model has been created, can anyone use it?

No. Veritone has built-in licensing protocols to ensure custom voices are only being used by approved parties that maintain high standards.

How will Veritone ensure synthetic voice is not misappropriated and that content rights are not violated?

For custom voice models, Veritone manages the model creation from end-to-end along with the production of audio files that use the model. All requests for synthetic content creation will come into Veritone Synthetic Voice and only be produced with prior audio and written approval from the voice owner.

What if I don’t want to have a custom voice model anymore?

Your voice is made into a code, and that code only works on Veritone systems. If you decide to stop using it, we destroy the code of your voice and provide a receipt of destruction. Once deleted, it will no longer exist on our servers or be available anywhere.

How does Veritone disclose to listeners that a synthetic voice is being used?

Working with the Open Voice Network, of which Veritone is a steering committee member, Veritone will adhere to best practices to protect consumers and IP (voice) owners.

Depending on the application of synthetic content, the listener may or may not know it’s synthetic. For example, when a celebrity authorizes the use of their voice model to fix a bit of audio in a movie, or to voice a foreign-language rebroadcast with localized translations, the audio file might go without official notice.

It is a best practice to offer a disclaimer for consumers when synthetic voice is used for net new content, particularly if a deceased voice is resurrected.

Consumer disclosures, in audio and/or visual form, may be required when the voice model is being licensed and used for a paid endorsement or for government officials making public statements.

What steps is Veritone taking to ensure it’s following best practices when it comes to synthetic voice?

Veritone upholds a promise for good and is committed to working to address public concern and protect the intellectual property of the voice talent and advertising community. We will publish industry best practices and governance for synthetic content usage in public or commercial channels. In addition, Veritone is an active member of the IAB, the Open Voice Network, and other governing bodies as part of our efforts to develop global best practices for synthetic content.
