From prison phone calls and body-camera footage to voicemails and video depositions, modern technologies have introduced a new type of evidence to legal professionals: unstructured data, and specifically unstructured media data such as photos, videos, CCTV footage, Zoom calls, and more.

Unfortunately, as the popularity of these new technologies has grown, so have the problems associated with them. In the legal arena, manually perusing unstructured media for that proverbial smoking gun now costs so much time and money that more than a few legal teams have simply agreed to exclude media files from evidence altogether.

But with unstructured media files becoming so ingrained in our society, that’s no longer a viable option—especially not when you have whole cases that are being built exclusively upon such evidence.

Hence, modern attorneys need options. They need artificial intelligence (AI) software that can take unstructured data and adapt it into whatever format and language might be needed: software that maximizes the impact of important evidence during a case and makes it easier to compile a comprehensive record.

That’s why we believe the AI legal revolution wouldn’t be complete without software capable of performing two vital functions: transcription and translation.

The Growing Problem of Unstructured Data

As we discussed in our last post, unstructured data refers to digital information that does not follow a predefined data model or structure. This includes almost every form of digital media you can think of, including videos, photos, website information, texts, emails, CCTV footage, weather reports, TV, data streaming, voicemail, and even most text documents.

Unstructured data is a huge part of our technology-driven world, allowing us to send and receive information with just a few clicks of a cell phone (to name only one such source). However, this accessibility and ease is exactly what’s turning unstructured data into such a huge problem for legal professionals.

Not only does this material need to be sorted, reviewed, and categorized during e-discovery, it also has to be transcribed. And with unstructured data growing at an estimated rate of up to 55-60% a year, attorneys are barely treading water in the fight to keep up.

Putting aside the other hurdles of document review, here’s a closer look at the specific issues that unstructured data is creating for transcription and translation.


First off, transcription. In the legal arena, a transcript is simply a written record of an audio or video file. More specifically (in our case), we’re talking about audio or video evidence produced during e-discovery, whose words are typed into a text document with a particular, standardized format. This transcription makes it easier for legal teams to then search, categorize, and locate necessary information later on.

Audio transcriptions are important during many different phases of a lawsuit’s lifecycle, and are used by attorneys to:

  • Make digital information more accessible
  • Help visualize case strategies
  • Highlight important areas of a case
  • Prepare witness questions for trial
  • Strategize for appeals
  • Keep an accurate case record
  • Educate the next generation of legal professionals

In the past, attorneys have often outsourced these transcription duties to a transcription service, whose business model is to do all the heavy lifting of compiling speech-to-text documents at a reasonable price.

However, considering the exponential growth of unstructured data, this option is becoming less and less financially viable, particularly since English isn’t the only language that modern attorneys need to worry about. That brings us to our next area of concern: translation.


In many cases, it’s not just about turning audio into text, but about turning audio into text into English. Finding human translators with the necessary skill set to carry out such a task can be challenging—especially when you consider that it’s not just about finding someone with the right language proficiency.

When hiring outside help for a large-scale e-discovery project, law firms are required to hire translators who are licensed, ABA attorneys in good standing. This ensures the protection of sensitive information and keeps the integrity of attorney-client privilege intact.

With the introduction of AI translation (along with a little help from AI transcription software), tackling the transcription and translation of legal documents just got a little bit easier—no matter what language you’re working with.

Veritone Illuminate: The Best Software for AI Transcription and Translation

These days, law firms no longer have to stress about finding attorneys who meet a lawsuit’s niche language requirements, and they don’t have to spend gobs of cash hiring out a small mountain’s worth of transcription work, either.

With Veritone Illuminate’s automatic transcription software, firms can now get speech-to-text audio transcripts at a near real-time pace. These can be generated from videos (like security footage or social media posts), audio files (such as voicemails or 911 calls), and even other text documents.

Veritone Illuminate uses speech detection and recognition to assign labels and perform speaker identification, which keeps transcriptions organized. Veritone Redact can then be used to redact confidential or personally identifiable information from both audio files and transcripts alike. This information can then be exported to a firm’s application of choice, making it fully searchable and easy to organize, filter, and review.
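For readers who like to see the idea in concrete terms, here is a minimal Python sketch of that general workflow: speaker-labeled transcript segments, a redaction pass, and a keyword search. It is an illustration only and does not show Veritone’s actual API. The `Segment` type, the `redact` and `search` helpers, the two regular-expression patterns, and the sample transcript are all hypothetical, and real redaction tools cover far more than phone numbers and email addresses.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One speaker-labeled piece of a transcript (hypothetical structure)."""
    speaker: str
    start: float  # offset into the recording, in seconds
    text: str


# Illustrative patterns only: US-style phone numbers and email addresses.
PII_PATTERNS = [
    re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
]


def redact(text: str, mask: str = "[REDACTED]") -> str:
    """Replace every match of the illustrative PII patterns with a mask."""
    for pattern in PII_PATTERNS:
        text = pattern.sub(mask, text)
    return text


def redact_transcript(segments: list[Segment]) -> list[Segment]:
    """Return a redacted copy of the transcript, keeping speakers and timing."""
    return [Segment(s.speaker, s.start, redact(s.text)) for s in segments]


def search(segments: list[Segment], keyword: str) -> list[Segment]:
    """Case-insensitive keyword search across transcript segments."""
    needle = keyword.lower()
    return [s for s in segments if needle in s.text.lower()]


# Made-up sample data standing in for a speech-to-text result.
transcript = [
    Segment("Speaker 1", 0.0, "Call me back at 555-867-5309 about the contract."),
    Segment("Speaker 2", 4.2, "Sure, or email jane.doe@example.com with the contract terms."),
    Segment("Speaker 1", 9.8, "Thanks, talk soon."),
]

redacted = redact_transcript(transcript)
for hit in search(redacted, "contract"):
    print(f"[{hit.start:>5.1f}s] {hit.speaker}: {hit.text}")
```

Because every segment keeps its speaker label and timestamp, a reviewer can jump from a search hit straight back to the matching moment in the original recording.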

And the benefits aren’t limited to English files alone.

Veritone’s AI translation software is designed to recognize and react to over a hundred different languages—and not just formal speech patterns either, but casual slang, as well. These enhanced capabilities allow firms to automatically generate both an English and an original language transcript from the same file, all without the need for a specialized team of foreign language attorneys.

In short? With the twin superpowers of AI translation and AI transcription software, attorneys are now fully equipped to handle the growing amount of unstructured data, no matter what language those 1s and 0s stack up to be.

Veritone Illuminate for Images

To an ancient civilization, the mere existence of something like unstructured data would probably have been considered nothing short of sorcery. To a modern citizen, however, things like audio and video files are just another incredibly extraordinary part of ordinary life, as is the relatively new introduction of automatic transcription and AI translation.

However, what good are all those translations and transcriptions for a legal team, if attorneys still have to go back and visually comb through all the images associated with said videos? (We won’t lie—just thinking about that is making our eyes water…).

Luckily, Veritone Illuminate also offers image classification, and Veritone Redact can identify and redact specified objects, words, and human heads. Together, these capabilities help legal teams achieve a faster, more efficient, and more accurate workflow.

To learn more about Veritone Illuminate, Veritone Redact, and the rest of the Veritone Evidence platform, reach out to a team member today.