So far, our discussions about the AI legal revolution have revolved around the many ways artificial intelligence is helping to combat the unstructured data crisis looming over the legal industry. However, one area we’ve yet to address is what to do about all the personally identifiable information (PII) that’s often contained within unstructured media files.

Unlike in days gone by, redacting unstructured data can’t be accomplished by hand with a box of good Sharpies. Instead, you need a computer program: automated redaction software that’s not only capable of removing sensitive text from documents, but also equipped to handle faces, objects, and information inside unstructured media. It should be fast, cost-effective, and fully customizable to meet the fluctuating needs of any compliance request, all without damaging any original files.

Until recently, getting all of that in a single platform has been a tall order. However, with the introduction of Veritone’s automated redaction software, attorneys can now have their cake and eat it too.

In this article, we’ll talk about what PII is, how unstructured data has expanded its scope, and how AI can both better protect individual privacy and streamline the e-discovery redaction process, all at the same time.

Personally Identifiable Information: A Really Important Mouthful of Words

First off, let’s talk about personally identifiable information, and why it’s so important to protect.

The phrase itself is a mouthful (unfortunately, not of the cake variety), which is why most people simply shorten it to its acronym, PII. According to the Department of Homeland Security, PII is any kind of information that either directly reveals the identity of a person or gives enough hints for someone to infer that identity.

PII is roughly divided into two categories:

  1. Sensitive PII
  2. Non-sensitive PII

Here’s how they differ from each other.

Sensitive PII vs. Non-sensitive PII

As you might already have inferred, sensitive PII is the big no-no. This type of information is pretty obviously personal, and—according to the Federal Rules of Civil Procedure (along with the rules of most states)—law firms are required to redact it before making documents public. This includes things like a person’s:

  • Home or email address
  • Cell phone or landline number(s)
  • Social Security Number
  • Passport or immigration information
  • Financial account numbers
  • Medical records
  • Photos and videos (particularly of the face, or other identifying features)
  • Biometric data (such as retina scans, voice signatures, or facial geometry)

On the other hand, you have non-sensitive PII. This category includes things that don’t quite spell out your name and address with a flourish, but definitely flirt a little too close to home for complete comfort. We mean information like your:

  • Birthday
  • Race
  • Gender
  • Zip code
  • Place of birth
  • Religion

While these things aren’t as invasive as sensitive PII, non-sensitive PII might still need to be redacted if it’s combined with other identifiers that, taken together, point to a specific person.

For example, if I had a document that read: “The following schedule is the personal itinerary for that one person who runs the country, and lives in the big, white house in Washington D.C….” you’d obviously know who I was talking about, even without me having to actually reveal any sensitive PII.

Hence, it’s sometimes necessary for firms to flag and redact even non-sensitive information before disclosing certain documents.
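
To make the two categories a little more concrete, here is a minimal Python sketch of how text-based PII might be flagged. It is an illustration only, not how Veritone Redact (or any other product) works. The patterns, the `redact_text` and `needs_review` names, and the "two or more identifiers" threshold are all assumptions made for this example, and real-world detection needs far broader coverage than three regular expressions.

```python
import re

# Illustrative patterns only (US-style formats); these are assumptions,
# not an exhaustive or production-grade definition of sensitive PII.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}

# Non-sensitive identifiers from the list above, as hypothetical field names.
QUASI_IDENTIFIERS = {
    "birthday", "race", "gender", "zip_code", "place_of_birth", "religion",
}


def redact_text(text):
    """Return (redacted_copy, findings); the original string is left untouched."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        def _mask(match, label=label):
            # Record what was found, then replace it with a labeled marker.
            findings.append((label, match.group(0)))
            return f"[REDACTED {label}]"
        text = pattern.sub(_mask, text)
    return text, findings


def needs_review(fields_present, threshold=2):
    """Flag a document when several non-sensitive identifiers appear together.

    The threshold of 2 is an arbitrary assumption for illustration.
    """
    return len(QUASI_IDENTIFIERS & set(fields_present)) >= threshold


sample = "Contact Jane at jane.doe@example.com or 555-867-5309. SSN: 123-45-6789."
redacted, found = redact_text(sample)
print(redacted)
print(needs_review(["zip_code", "birthday"]))
```

Because the function returns a redacted copy plus a list of findings, the original text is never altered, and a reviewer can check each hit before anything is disclosed.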

Failure to Redact PII

The consequences for not redacting PII in discovery can range from embarrassing, to expensive, all the way up to civil liability, and—depending on the circumstances—even criminal charges. And that’s just on the home front.

Abroad, EU lawmakers have thrown down a pretty serious PII detection gauntlet in the form of the General Data Protection Regulation (GDPR), which took effect in 2018. Under these regulations, even companies located outside of the EU face painfully steep fines for the unauthorized release of an individual’s personally identifiable information.

Bottom line? Redacting PII in discovery isn’t something you want to mess up.

That being said, for the modern attorney, PII redaction can be a lot more difficult than it might sound, especially when you consider that legal professionals are dealing not just with text documents, but with unstructured media files, too.

The Challenges of PII Redaction During E-discovery

The inevitable side effect of the unstructured data boom is that capturing a person’s PII is as easy as snapping a photo or pressing record. Because remember, PII is not limited to text alone. Sensitive PII also includes facial features and biometric data, the kind we unintentionally collect every time we make a recording in a public place.

With PII lurking within the background pixels of untold numbers of unstructured media files, the challenges of e-discovery redaction have become a new kind of monster altogether. A time-consuming, expensive task, with a high likelihood of human error. One that requires attorneys to laboriously scour individual photos, videos, and audio files, searching for faces, objects, and words to blur, mute, and obscure by hand.

Considering that, on average, every attorney wastes over eleven hours per week on problems related to document review, it’s no wonder that attorneys on both sides of the aisle are simply agreeing to leave out this evidence altogether.

However, anyone who’s been paying attention to current technological trends knows that this is fast becoming a non-option.

Around the globe, the number of cases relying almost exclusively on unstructured data is climbing, so it’s no longer enough for AI solutions to offer text redaction alone. Instead, attorneys need automated redaction software with machine learning algorithms that can find and redact PII not only in text, but in audio, photo, and video files, too.

That’s why we’re pleased to introduce Veritone Redact’s fully automated redaction software, which is capable of protecting PII in whatever form of unstructured media it might appear—just as prescribed!

Veritone’s Automated Redaction Software: the AI-Powered Solution

With Veritone Redact, protecting the personally identifiable information of individuals is now easier than ever before. More importantly, it’s more effective.

Attorneys who take advantage of this program will have access to a myriad of different PII detection tools, which can help them expedite e-discovery redaction, curb costs, save valuable resources, and decrease the likelihood of human error.

Here’s a closer look at some of these tools.


The ability to search for faces, user-defined objects, and even spoken words and phrases contained within video and audio evidence. Files can then be flagged and marked to make subsequent searches much easier.

Define and Live Track

The ability to select a specific object inside a video (such as an ID card, notebook, face, or any other user-defined object), which can then be marked and automatically tracked throughout the rest of the video. Live tracking can then mark the zone of redaction with nothing more than a click of a cursor.


The ability to automatically blur out faces, user-defined objects, and even distort spoken words and phrases contained within video and audio evidence. These areas can then be flagged and marked to make subsequent searches much easier.


The ability to generate a comprehensive report of all the actions taken against redacted video or audio evidence. This allows attorneys to quickly and easily meet compliance requirements with chain-of-custody support.
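
For readers curious what an "actions taken" record can look like under the hood, here is a minimal Python sketch of a tamper-evident log. It is not Veritone’s implementation or report format. Hash chaining is simply one common technique for this kind of record, and the `append_entry` and `verify` functions, along with every field name, are hypothetical. A real log would also record timestamps, which are left out here to keep the example short.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """Hash an entry's contents in a stable (sorted-key) JSON form."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log, action, file_name, user):
    """Append an action record that is linked to the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "action": action,
        "file": file_name,
        "user": user,
        "prev_hash": prev_hash,
    }
    entry["hash"] = _digest(entry)  # hash computed before the "hash" key exists
    log.append(entry)
    return entry


def verify(log):
    """Return True only if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "blur_face", "interview_01.mp4", "reviewer_a")
append_entry(log, "mute_audio", "interview_01.mp4", "reviewer_b")
print(verify(log))
```

Because each entry stores the hash of the one before it, editing or deleting an earlier record breaks every link that follows, which is what makes this kind of log useful as chain-of-custody support.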

Search Audio

The ability to quickly locate words and phrases spoken within audio evidence (such as interview room recordings, 911 calls, or body cam audio), and organize these clips with a keyword search. All audio redaction includes a written transcript for added visual support and accuracy.
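
As a rough illustration of how a keyword search over audio can work once a transcript exists, here is a minimal Python sketch. It assumes a speech-to-text step has already produced a word-level transcript with start and end times. The `Word` type and `find_phrase` function are hypothetical names for this example, not part of any Veritone API.

```python
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    """One transcribed word with its start and end time in seconds."""
    text: str
    start: float
    end: float


def _normalize(token: str) -> str:
    # Compare case-insensitively and ignore surrounding punctuation.
    return token.lower().strip(".,!?;:\"'")


def find_phrase(transcript: List[Word], phrase: str) -> List[Tuple[float, float]]:
    """Return the (start, end) time span of every occurrence of the phrase."""
    target = [_normalize(t) for t in phrase.split()]
    if not target:
        return []
    words = [_normalize(w.text) for w in transcript]
    spans = []
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            last = transcript[i + len(target) - 1]
            spans.append((transcript[i].start, last.end))
    return spans


transcript = [
    Word("My", 0.0, 0.2),
    Word("name", 0.2, 0.5),
    Word("is", 0.5, 0.6),
    Word("Jane", 0.6, 0.9),
    Word("Doe.", 0.9, 1.3),
]
print(find_phrase(transcript, "jane doe"))
```

Each returned span marks a stretch of audio that could then be muted or distorted, while the matching words are masked in the written transcript.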


The ability to download redacted videos, audio files, and ‘action taken’ reports to a local computer, with no more effort than it takes to click a mouse.


The ability to manage digital evidence and redaction workloads, by tagging audio or video evidence with a ‘current review status.’ This, in turn, makes workflow more efficient, and progress tracking much simpler.

Legal AI: The Power to Redact, and The Power to Predict

Protecting confidential information is one of the biggest concerns with any large-scale e-discovery project. Luckily, not only can Veritone redact PII within text, it can handle the complexities of unstructured media, as well. And—when used in conjunction with Veritone Illuminate, an AI-powered early case assessment tool—the modern attorney is not only capable of tackling the biggest hurdles in document review, they also have an added superpower…

They can tell the future.

With AI algorithms that can recognize and analyze patterns within evidence, legal professionals can now receive early case insights, which can help predict a lawsuit’s optimal direction, all before attorneys even know what they’re dealing with.

And if that isn’t telling the future, we don’t know what is.