05.3.18

My 5 favorite practical machine learning use cases

For a few years now, I’ve been helping businesses, startups, and enterprises implement machine learning in their products, services, and apps. These are some of my favorite use cases that I’ve come across (so far)!

Machine learning is exceptionally good at conducting repetitive tasks, finding patterns, and predicting outcomes. When implemented correctly, and used for the right use cases, it can reduce costs, save time, and open up new sources of revenue for companies. It can also benefit society, because it can scale work that would normally be too expensive or too difficult to do. Which leads me to my first favorite practical use case for machine learning.

Detecting Fake News

I know… I know… this is a big one. I’ve previously written about my efforts, and eventual partial success, in using machine learning to weed out fake news.

This is not an easy problem to solve, partially because fake news is a moving target. There are several different types of fake news: news that simply isn’t true, news that is highly biased but not factually inaccurate, satire, misleading stories, and so on.

Theoretically, one could sort out what classes or categories fake news falls into, train a model on those categories, and then be able to predict or detect fake news in future content. The challenge is no longer the tech chops required to build the right models, but the work it would take to get the right training data. For example, you can use Classificationbox by Machine Box to create a fake news detector in a few minutes, as long as you have the sample data.

Fake news is a real problem, and part of the problem is that there aren’t enough humans to sit there and manually sift through every article to determine its genuineness. This is the kind of scale problem that machine learning is really good at solving.
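
To make the "train a model on those categories" idea concrete, here is a minimal sketch of a text classifier in plain Python. It is a small naive Bayes model, not how Classificationbox works internally, and the class name, the `teach`/`predict` methods, the category labels, and the training sentences are all made up for illustration. A real detector would need far more (and far better) training data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def teach(self, text, label):
        """Add one labeled example to the model."""
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log(
                    (counts[tok] + 1) / (total_words + len(self.vocab))
                )
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training examples: two per hypothetical category.
examples = [
    ("scientists confirm the moon is made of cheese", "untrue"),
    ("aliens secretly replaced the city council", "untrue"),
    ("the disastrous mayor ruins everything he touches", "biased"),
    ("corrupt disastrous officials fail citizens yet again", "biased"),
    ("local man heroically finishes entire sandwich", "satire"),
    ("area cat elected employee of the month", "satire"),
    ("city council approves budget for road repairs", "genuine"),
    ("officials publish annual report on road safety", "genuine"),
]

model = NaiveBayesClassifier()
for text, label in examples:
    model.teach(text, label)

prediction = model.predict("council approves report on road budget")
print(prediction)
```

The model itself is only a few dozen lines. The `examples` list is the part that would take real effort to get right, which is the point made above about training data.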

Face Authentication

I think it is reasonable to argue over the true effectiveness and necessity of Apple’s iPhone X Face ID technology compared to the fingerprint sensor for authentication. Both are incredibly convenient, and pretty darn good at solving the problem of remembering and typing in passwords.

What I particularly like about using face recognition for authentication is that it can be applied to almost any device without special hardware. Just about every computer or phone today has a camera in it, and a camera is all you need to do face recognition. It may be necessary, in some cases, to add extra hardware to verify that someone isn’t just holding up a picture of your face to fool the system, but I don’t think it’s an absolute requirement. First of all, there are ways around that. You can use some logic to detect movement, track people’s eyes, or ask them to enter a short PIN, which keeps authentication fast while providing a secondary method of verification.

But secondly, you may have scenarios where attempts to spoof the system are unlikely or inconsequential. We have customers who are using Facebox to verify people taking tests, buying sandwiches, or entering a building. It still brings a lot of time savings, new sources of revenue, cost savings, and value to customers, because attempts to spoof it are too rare to be worth sacrificing everything else to prevent.
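
The "face plus a short PIN" idea from earlier in this section can be sketched in a few lines. This is only an illustration under some assumptions: it supposes a face recognition model has already turned each camera frame into a list of numbers (an embedding), and the vectors, the `authenticate` function, and the 0.9 threshold are all invented for the example. It is not Facebox's API.

```python
import hmac
import math


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def authenticate(probe, enrolled, pin_entered, pin_expected, threshold=0.9):
    """Accept only if the face matches AND the short PIN is correct."""
    face_ok = cosine_similarity(probe, enrolled) >= threshold
    # compare_digest avoids leaking how many PIN characters matched.
    pin_ok = hmac.compare_digest(pin_entered.encode(), pin_expected.encode())
    return face_ok and pin_ok


# Hypothetical embeddings; a real model would produce much longer vectors.
enrolled = [0.12, 0.80, 0.35, 0.44]     # stored when the user signed up
same_person = [0.10, 0.82, 0.33, 0.47]  # new camera frame, same face
stranger = [0.90, 0.05, 0.40, 0.02]     # new camera frame, different face

print(authenticate(same_person, enrolled, "4821", "4821"))  # face and PIN match
print(authenticate(stranger, enrolled, "4821", "4821"))     # wrong face
print(authenticate(same_person, enrolled, "0000", "4821"))  # wrong PIN
```

Someone holding up a photo might pass the face check, but they would still need the PIN. Where spoofing is unlikely or inconsequential, the PIN step could simply be dropped.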

Making Document Search Smarter

This isn’t the sexiest use case for machine learning, but it is one of my favorites because it’s so simple and yet so powerful.

Let’s say you have hundreds of thousands of pages of text, e-mails, messages, and so on, and you want to make it all searchable. Well, what you certainly shouldn’t do is try to tokenize and then index every word in every document. This will quickly overwhelm your search engine, and performance will suffer.

Instead, use something like Textbox, where you can run all of your text through some Natural Language Processing to extract keywords and named entities. Index those results instead. Your search engine has far less to index, so it will perform much better, and the results will be more relevant since you’ve already gone through a process of understanding the language and context in the source documents.

My co-founder David Hernandez wrote a bit about how to do this in practice with Elasticsearch.
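
As a rough illustration of "index the keywords, not every word", here is a small self-contained sketch. It does not use Textbox or Elasticsearch. The keyword extractor is a deliberately crude stand-in (most frequent non-stopwords), and the function names, stopword list, and sample documents are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# A tiny, hand-picked stopword list; real NLP tools use much larger ones.
STOPWORDS = {
    "the", "for", "is", "a", "an", "and", "of", "to", "in", "on", "by",
    "please", "before", "we", "our", "will", "at", "this", "that", "it",
}


def extract_keywords(text, top_n=2):
    """Crude keyword extraction: the most frequent non-stopword tokens."""
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]
    return [word for word, _ in Counter(words).most_common(top_n)]


def build_index(documents, top_n=2):
    """Map each extracted keyword to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for keyword in extract_keywords(text, top_n):
            index[keyword].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents indexed under every query term."""
    terms = re.findall(r"[a-z]+", query.lower())
    if not terms:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


documents = {
    "email-1": "The invoice for the March shipment is attached. "
               "Please pay the invoice before the next shipment.",
    "email-2": "Our team meeting moved to Tuesday. "
               "The meeting agenda covers the team budget.",
    "chat-1": "The shipment arrived damaged. "
              "We will file a claim for the damaged shipment.",
}

index = build_index(documents)
print(search(index, "shipment"))
print(search(index, "damaged shipment"))
print(search(index, "budget"))
```

The index ends up with a handful of entries instead of one per distinct word. The trade-off shows up in the last query: "budget" appears in a document but was not extracted as a keyword, so it is not searchable. Better extraction (keywords plus named entities, as described above) is what makes that trade-off worthwhile.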

Content Recommendation

No one really likes to be put into a box and thought of as someone whose likes and tastes can be predicted, but whether we like it or not, we do, as a whole, act in predictable patterns. Finding those patterns can be tricky to do manually, but it’s a great task for machine learning.

You might not know why a certain cohort of your users chooses to click on a news post about sports when they’re usually interested in science, or why millennials keep trying to buy monocles from your online store. Thanks to machine learning, you don’t really have to know why. You just want to make sure you catch those kinds of trends, and exploit them to maximize engagement, revenue, clicks, views, or any other metric.

For example, you can feed a tool like Suggestionbox all the data you have about users, give it some things to choose from, reward the model when people decide to click, buy, or otherwise engage with something, and sit back and watch the machine learning model learn about users and their behavior on your site. It’s quite astonishing.

The nice thing is that you don’t need to know what to look for ahead of time. That is what the model is figuring out for you. It may be the person’s age, the time of day, their location, their gender, or their previous purchases or clicks that decides what they’re likely to do next. You just don’t know. But a machine learning model can learn those things on the fly, and make decisions based on that learning to maximize your desired outcome.
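
One simple way to get this "show something, reward what works" behavior is an epsilon-greedy bandit that keeps separate statistics per context. The sketch below is an assumption-laden illustration, not Suggestionbox's algorithm: the class, its method names, the contexts, and the simulated users are all made up, and the simulated users are deterministic only so the example behaves the same on every run.

```python
import random
from collections import defaultdict


class EpsilonGreedyRecommender:
    """Per-context epsilon-greedy bandit: mostly exploit, sometimes explore."""

    def __init__(self, choices, epsilon=0.1, seed=0):
        self.choices = list(choices)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shown = defaultdict(int)         # (context, choice) -> times shown
        self.reward_sum = defaultdict(float)  # (context, choice) -> total reward

    def rate(self, context, choice):
        """Average reward observed so far for this choice in this context."""
        shown = self.shown[(context, choice)]
        return self.reward_sum[(context, choice)] / shown if shown else 0.0

    def best(self, context):
        """The choice with the highest observed reward rate."""
        return max(self.choices, key=lambda c: self.rate(context, c))

    def recommend(self, context):
        """Pick a choice to show and record that it was shown."""
        unseen = [c for c in self.choices if self.shown[(context, c)] == 0]
        if unseen:
            choice = unseen[0]                      # try everything once
        elif self.rng.random() < self.epsilon:
            choice = self.rng.choice(self.choices)  # explore
        else:
            choice = self.best(context)             # exploit
        self.shown[(context, choice)] += 1
        return choice

    def reward(self, context, choice, value=1.0):
        """Tell the model that the user engaged with the choice."""
        self.reward_sum[(context, choice)] += value


# Simulated behavior the model knows nothing about in advance.
preferences = {"morning": "sports", "evening": "science"}
model = EpsilonGreedyRecommender(["politics", "science", "sports"])

for _ in range(200):
    for context, liked in preferences.items():
        shown = model.recommend(context)
        if shown == liked:          # the simulated user clicks
            model.reward(context, shown)

learned = {context: model.best(context) for context in preferences}
print(learned)
```

Nothing in the model says that time of day matters or which topic goes with which time. It only sees what was shown and what was rewarded, which is the "you don't need to know what to look for" property described above. Real users are much noisier than this simulation, so a real system needs many more observations.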

Visual Search

“A lot of the future of search is going to be about pictures instead of keywords” — @Pinterest CEO Ben Silbermann

I love visual search because it breaks us out of the need to have to describe things with text that are otherwise hard to describe. I may want to find a product, logo, or photograph that is similar looking to one I already have. Or I may see something on a site I really like and want to see more like it.

If you’re a writer on Medium, you’re probably very familiar with Pixabay and Unsplash, which are great sources of copyright-free images. When browsing for images, sometimes you can only describe what you’re looking for by pointing to another image and saying “something like this”. I wish these sites did a better job of implementing a capability that can be found in Tagbox, where you can teach a machine learning model with tons of images, and then ask it to find similar looking images given a new image. You can actually play with this capability in a demo.

It really is true that a picture is worth a thousand words. And putting 1000 words into a search bar isn’t a good idea.
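
The "teach it images, then ask for similar ones" flow can be sketched with a very simple notion of similarity: comparing color histograms. This is an assumption made to keep the example self-contained. Real visual search (including, presumably, Tagbox) relies on learned image features rather than raw colors, and the `VisualIndex` class, its methods, and the fake "images" (plain lists of RGB pixels) are invented for illustration.

```python
def color_histogram(pixels, bins=4):
    """Fraction of pixels falling into each coarse RGB bucket."""
    hist = [0.0] * (bins ** 3)
    if not pixels:
        return hist
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    return [count / len(pixels) for count in hist]


def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical color mixes, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


class VisualIndex:
    """Stores a feature vector per image and ranks images by similarity."""

    def __init__(self):
        self.items = {}  # image name -> histogram

    def teach(self, name, pixels):
        self.items[name] = color_histogram(pixels)

    def similar(self, pixels, limit=3):
        query = color_histogram(pixels)
        ranked = sorted(
            self.items,
            key=lambda name: similarity(query, self.items[name]),
            reverse=True,
        )
        return ranked[:limit]


# Fake 100-pixel "images": mostly one color plus a bit of background.
catalog = {
    "red_dress": [(220, 20, 30)] * 90 + [(250, 250, 250)] * 10,
    "blue_jeans": [(30, 40, 200)] * 95 + [(250, 250, 250)] * 5,
    "red_scarf": [(200, 30, 40)] * 80 + [(10, 10, 10)] * 20,
}

index = VisualIndex()
for name, pixels in catalog.items():
    index.teach(name, pixels)

query_image = [(210, 25, 35)] * 85 + [(250, 250, 250)] * 15
results = index.similar(query_image, limit=2)
print(results)
```

The query never mentions the word "red". The image itself is the search term, which is the appeal of visual search.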

Machine Learning Use Cases

There are a lot of good use cases for machine learning out there. Machine learning is a great tool for experimentation. Training data is by far the most important aspect of implementing something worthwhile, and gathering a good training set is still as much an art as a science.

Experiment, experiment, experiment!