

What is Veritone and who is it built for?

It is a hyper-realistic synthetic Voice as a Service (VaaS) solution that allows anyone to create, manage, share, license, and monetize their professional-quality synthetic voice and easily convert it into different tones, dialects, accents, and languages. It is designed for any company or person who places value on voice or the spoken word, and it is immediately available for media companies, production studios, brands, celebrities, and influencers to accelerate content creation, automate voice production, and further monetize voices. Although this technology is poised to transform the media and entertainment industry, it will also affect numerous other markets, including criminal justice, public safety, legal, and compliance.

What benefits will Veritone deliver for customers and prospective customers?

As a complete end-to-end solution for synthetic voice, it gives media companies, production studios, brands, celebrities, and influencers a full suite of voice capabilities, including voice creation and usage, management, licensing, compliance and clearances, workflows, and monetization.

Synthetic voice is an exciting and disruptive technology with numerous positive commercial applications. Here are a few industries that can benefit:

  • Advertisers & Agencies: Enhance advertisements with well-known voices and quickly turn around influencer endorsements. Automate production efforts and streamline voiceovers.
  • Broadcast TV and Radio: Seamlessly add voiceovers to news reports and podcasts, as well as talent endorsements to broadcasts.
  • Film and Studio Production: Craft professional voices from talent, automate voice retakes and voiceovers, or localize/regionalize into foreign languages and dialects. Furthermore, it can replicate or resurrect actor or actress voices while adding or editing dialogue as needed.
  • Government: Capture trusted voices of authority, including mayors, governors or other public officials, to engage with local or wider populations in the appropriate languages, dialects, and accents.
  • Law & Public Safety: Create and replicate trusted voices of local leadership for important news and announcements, and easily translate into multiple languages in minutes.

For more industries, please visit

What business problems are Veritone solving for?

Media companies, production studios, celebrities, and influencers face limits on how much content they can produce, because people can’t be everywhere at once. With synthetic voice, media entities and influencers can increase their content output, in their native language or in foreign languages, so they don’t miss any opportunities to monetize or optimize their voices. It also helps reduce the cost of localizing content, so brands can use fewer resources to tailor their message for a particular region or translate it into different languages or dialects.

How is Veritone unique from existing synthetic voice vendors?

Rather than focus on one aspect of synthetic voice creation, it offers a comprehensive suite of integrated voice features, including voice creation, voice management, voice licensing with compliance and clearances, voice workflows, and voice monetization. With customizable workflows, it helps streamline and automate processes to make it easier to create, manage, and share voice content collections.

Further, it is built on aiWARE, the first OS for AI built by Veritone, making the user experience consistent and connected across existing Veritone applications. This means users will have access to evolving best-of-breed voice engines and be able to combine them with other cognitive capabilities like transcription, translation, sentiment analysis, and more.

How will Veritone protect against “deepfakes” and potentially malicious intentions?

Owning one’s voice and protecting one’s IP are critical. As a core tenet of Veritone’s M&E businesses, we want to make sure that we not only help our clients generate licensing opportunities but also ensure they have the necessary support to navigate compliance and clearances. This will ensure their name, image, and likeness are only being used by approved brands that maintain high standards.

Our synthetic voice safeguards include regulated processes and checkpoints to ensure proper rights, clearances, and pricing are followed. Added IP protection includes inaudible watermarks and other tools to help ensure content can only be accessed after permission is granted. Proprietary content-claiming tools also help protect everyone’s voice and give users the ability to place claims against unauthorized monetization of their content on social platforms.

Will synthetic voice raise questions about authenticity?

For content creation, synthetic voice will be an accelerator and differentiator, giving companies and individuals a faster, more cost-effective solution to expand and extend content. 

For brand endorsements and influencer marketing, synthetic voice is a powerful tool that remains under the complete control of the voice owner. Some influencers may use synthetic voice only for localization (foreign languages) or production editing, while others may use it for complete end-to-end production. Either way, the influencer, who knows their audience best, has full control.

Many audiences are already fully embracing and engaging with synthetic content, and with the rise of synthetic voices in devices and smart assistants, it’s becoming more common, especially among millennials and Gen Z, who are the fastest adopters of synthetic-voice-enabled devices. With the synthetic voice market expected to reach $26.8 billion by 2025, synthetic voice is becoming more widely used and accepted by the general public.

To further guarantee authenticity, Veritone will verify that all synthetic voices are created by the original owner. With features such as inaudible watermarks and other built-in capabilities, owners can take action if their voice is being used without permission.

How and why did Veritone choose the name?

The word “marvel” means to be amazed and filled with wonder. To Veritone, the ability to create and replicate someone’s voice synthetically through AI from audio samples is a powerful feat in itself. But when you combine that with the ability to extend, scale, and monetize created voices with new use cases and mass-market demand, that’s simply amazing. With our heritage in AI, audio, licensing, and advertising, coupled with our expansive ecosystem of media customers, we feel we’re well-positioned to fascinate the mainstream with synthetic voice, driving wider adoption.


What is Voice as a Service, and what capabilities will it deliver?

Voice as a Service (VaaS) is a SaaS solution that allows anyone to create, manage, share, license, and monetize their professional-quality synthetic voice and easily convert it into different tones, dialects, accents, and languages.

What is the difference between text-to-speech vs. speech-to-speech processes?

Text-to-speech (TTS) converts digital text, including markup language, into audible synthetic speech. The speech is produced either with a professional-quality synthetic voice (chosen across genders, languages, accents, and more) or with an AI-generated custom voice model built from audio samples.
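As an illustration of what text with markup might look like, here is a minimal Python sketch that builds a small document in SSML (Speech Synthesis Markup Language, a W3C standard). It is an assumption that the markup in question is SSML; this FAQ does not name the markup language or describe how it is submitted. The function name `build_ssml` and its parameters are hypothetical and are not part of any Veritone API.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.1 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(intro: str, key_phrase: str, pause_ms: int = 300) -> str:
    """Return an SSML string: intro text, a pause, then an emphasized phrase.

    Hypothetical helper for illustration only; not a Veritone API.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.1", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.text = intro + " "
    # <break> inserts a pause of the given length.
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    # <emphasis> asks the engine to stress the enclosed words.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = key_phrase
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Welcome back.", "Here is today's top story.")
print(ssml)
```

Markup of this kind lets the author control pauses and emphasis that plain text cannot express. Which tags a given voice engine honors varies by engine.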

Speech-to-speech (STS) can generate a more dynamic AI-generated custom voice that approximates real human speech patterns better than text alone. However, the process is more involved as it requires audio samples to build and train both the custom voice model and the source voice model (think puppet and puppeteer). 

What is required to create your own custom synthetic voice?

Currently, within an account, users can create synthetic content by choosing professional-quality synthetic voices in our marketplace. Creating a custom synthetic voice requires the support of the Veritone Professional Services team, as well as high-quality audio samples from either live recorded readings or archives to generate the custom voice model.

On average, a custom voice model can be built in approximately three weeks. Once the model is built, synthetic voice content can be produced at scale, with a variety of subscription plans to meet customers’ use cases.

What professional-quality synthetic voices are immediately available? How many languages?

A rich marketplace of over 200 professional-quality voices is immediately available to customers. You may choose professional voices from a broad and diverse marketplace spanning genders, more than 60 languages, numerous accents, and much more for all content projects.

What is required to create a custom voice for text-to-speech process?

To start, the voice talent or individual whose voice will be recorded and used to create a custom voice model must explicitly consent to the usage of their speech data. If the voice talent is deceased, the estate must provide explicit consent.  

Next, speech data can be provided in several ways: as recorded audio utterances with mapping scripts to train a voice model, or as archived audio files. The quantity and quality of the training data provided will determine the quality of the custom voice.

Once the custom voice model is created, it will be enabled within your account to create text-to-speech files. 

What is required to create a custom voice for speech-to-speech process?

The requirements and process for creating a custom voice for speech-to-speech are similar to those for text-to-speech (see the previous question), but speech-to-speech also requires the training of the source voice actor, a professional or non-professional, who will produce the audio content based on the custom voice model. The process for training the source voice actor is similar to the creation of the custom voice model and requires a similar quantity and quality of training data. Note that a single source voice actor can drive multiple custom voice models. Also, if you do not have access to a source voice actor, Veritone can provide one for a fee.

Is Veritone built on aiWARE? Will customers gain access to all cognitive engines within aiWARE?

Yes, it is built on Veritone’s proprietary aiWARE operating system. Currently, users can leverage cognitive engines such as translation, sentiment analysis, and content classification, and combine them with multiple best-of-breed voice engines through our Professional Services team.

Can I choose the specific voice engines to create my AI-generated custom voice?

Yes, you’re able to choose a specific voice engine to create your custom voice. A growing set of market-leading voice engines is currently available, and our Professional Services team can help identify the right models for each use case.

How does Veritone store the voice training data and voice models?

All voice training data and voice models are stored in a highly secure, cloud-based architecture similar to what’s used for Veritone Digital Media Hub, ensuring the protection of your data.

Is Veritone available on both desktop and mobile?

Yes, it is mobile-responsive and built for any browser on PC and mobile.


What is the cost of a subscription, and what benefits does it include?

A subscription gives users access to text-to-speech functionality so they can generate synthetic speech from text using a selection of professional voices. It costs $199/month for 2 million characters, which equates to approximately 20 audio hours per voice.
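The figures above imply a rough conversion between characters, audio time, and cost. The short Python sketch below works through that arithmetic. The three constants come from this FAQ; the function names are hypothetical, and the result is only an estimate, since real audio length depends on the voice, language, and speaking rate.

```python
# Figures quoted in this FAQ for the text-to-speech subscription.
MONTHLY_PRICE_USD = 199
MONTHLY_CHARACTERS = 2_000_000
APPROX_AUDIO_HOURS = 20  # stated approximation for 2 million characters


def chars_per_minute() -> float:
    """Implied characters of text per minute of audio."""
    return MONTHLY_CHARACTERS / (APPROX_AUDIO_HOURS * 60)


def cost_per_audio_hour() -> float:
    """Implied cost in USD per hour of audio when the full quota is used."""
    return MONTHLY_PRICE_USD / APPROX_AUDIO_HOURS


def estimate_minutes(text: str) -> float:
    """Rough audio length, in minutes, for a script (hypothetical helper)."""
    return len(text) / chars_per_minute()


script = "a" * 5000  # stand-in for a 5,000-character script
print(f"{chars_per_minute():.0f} characters per minute")
print(f"${cost_per_audio_hour():.2f} per audio hour")
print(f"{estimate_minutes(script):.1f} minutes for the sample script")
```

In other words, the quoted plan works out to roughly 1,667 characters per minute of audio and $9.95 per audio hour if the full monthly allowance is used.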

For a custom voice that generates synthetic voice content through text-to-speech or speech-to-speech processes, pricing is tailored to each voice.

What is the cost to create a custom voice?

Pricing for custom voice creation is based on numerous factors, including the voice creation model (text-to-speech vs. speech-to-speech). Our Professional Services team will work with customers to clearly define cost and timeline.


How will Veritone protect my synthetic voice model and its usage?

To protect your synthetic voice, the platform includes features such as inaudible watermarks (the automated inclusion of a copyright tone), traceability (the ability to track the components used to replicate your voice clips), licensing protocols (regulated processes and checkpoints to ensure proper rights, clearances, and pricing are followed), and content-claiming features (proprietary tools to protect and place claims against unauthorized monetization of your content on social platforms).

Once a synthetic voice model has been created, can anyone use it?

No. As a core tenet of Veritone’s M&E businesses, we want to make sure our clients are not only generating licensing opportunities but also have the necessary support to navigate compliance and clearances. Built-in licensing protocols ensure that anyone’s name, image, and likeness are only used by approved brands that maintain high standards.

How will Veritone ensure synthetic voice is not misappropriated and that content rights are not violated?

For premium and protected voice models, Veritone will control the model creation from end to end, along with the production of audio files that use the model. All requests for synthetic content creation will come into Veritone and will only be produced with prior audio and written approval from the voice owner. This approval process will cause a slight delay in time to market, but it will still be much faster than the traditional studio model used today.

How does Veritone disclose to listeners that a synthetic voice is being used?

Depending on the application of synthetic content, the listener may or may not know it’s synthetic. For example, when a celebrity uses their voice model to fix a bit of audio in a movie, or to voice a foreign-language rebroadcast with localized translations, the audio file might go without official notice.

Working with the Open Voice Network, of which Veritone is a steering committee member, we will adhere to best practices to protect consumers and copyright (voice) owners. Audio and/or visual consumer disclosures may be required when the voice model is licensed and used for a paid endorsement or for government officials making public statements. Our goal is to protect the influencer’s voice brand and to let listeners know they are listening to a synthetic content file. We also want to let them know that the influencer has approved the use of their voice and has fully endorsed the product mentioned in the audio file.

Further, Veritone will place inaudible fingerprints in the synthetic audio. If the file is manipulated in any way, we will be able to identify the original file and any modifications made without the consent of the voice model owner.

What steps is Veritone taking to ensure it’s following best practices when it comes to synthetic voice?

Veritone is committed to working to address public concern and protect the intellectual property of the voice talent and advertising community. We will publish industry best practices and governance for synthetic content usage in public or commercial channels as well as invest in a Synthetic Content Governance Consortium with thought leaders to standardize synthetic voice commercial use and protection. In addition, Veritone is an active member of the IAB and Open Voice Network as part of our efforts to develop global best practices for synthetic content.

