Building a private, local photo search app using machine learning
This is it. This is the best DAM thing I’ve ever done. I don’t normally like to brag but I’m so freak’n proud of myself for this one that I feel like I need to share it. They said it wasn’t possible (no one actually said that), they said it couldn’t be done (lots of people said it could be done), but I did it and it works GREAT!
I’ve had a problem ever since having kids. Well… many problems, but I’ll focus on the tech-related one. The problem is that I have 80,000 photos. That isn’t an exaggeration. Between my wife and me, after our first human was born, we went from around 3,000 photos to 80,000 in just a couple of years.
Very quickly, it became apparent (pun intended) to me that there was almost no point in taking photos anymore because it would forever be impossible to find them again later. At the time, I had all my photos in Dropbox. Dropbox didn’t have any sort of photo management capabilities (and still doesn’t as of this writing), and it was something I desperately needed. I knew, from my work in machine learning, that there’s so much you can glean from a photo, such as who is in it, where it was taken, and what sort of scene it shows (future super-villain terrorizes parents, for example).
So I did the only logical thing. Ported over all 80,000 photos to Apple Photos so that some ML could be run and I could find my photos again. It worked brilliantly — the end. Bye!
False. It wasn’t the end. Now I’m stuck in Apple’s walled garden, as it were. I actually love the garden; there are lots of pretty flowers such as the iMaccius RhoddeNoCDrom, and the Ipaddus Expensivus. But not everyone does, and a post by my good friend Jaron Phillips reminded me that there’s always another way (aside from Google, which has serious privacy issues).
Run it locally
Let’s discuss why this post is written and how I’ve solved this amazing problem in an incredible way (and how you can do it too, albeit in a far less interesting way).
Your photos are (probably) incredibly private and hold many secrets, so you may want to have them exist locally on your computer only. But then they’re not searchable! Let’s change that! For this process to work, they need to be in a folder, so just export them all into a single folder (if they aren’t already). If you’re good at computers, you can tweak my script to make it work for photos in many subfolders as well. But the key component is really the machine learning.
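If your photos do live in many subfolders, the tweak could look something like the sketch below. This is not my original script, just a minimal illustration using Go’s standard filepath.WalkDir. The helper names (isImage, findImages) and the list of extensions are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// isImage reports whether a file name has a common photo extension.
// The extension list is an assumption; adjust it to suit your library.
func isImage(name string) bool {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".jpg", ".jpeg", ".png", ".gif":
		return true
	}
	return false
}

// findImages walks root recursively and returns the path of every image file.
func findImages(root string) ([]string, error) {
	var paths []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // e.g. unreadable directory or missing root
		}
		if !d.IsDir() && isImage(d.Name()) {
			paths = append(paths, path)
		}
		return nil
	})
	return paths, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: findimages <photo-directory>")
		return
	}
	paths, err := findImages(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Each path it prints can then be fed through the same tag-and-comment steps as a photo in a single flat folder.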
We’re going to use Tagbox, which has been pre-trained with thousands of tags for images, to tag all of our photos with what is in them, for example: ocean, sunset, beach, fog, dog, birthday cake, doom, etc. The great thing about Tagbox is that it runs locally on your computer. No cloud company stealing your secrets.
We’re going to store those tags INSIDE THE DAMN FILE. Imagine that?! Why are we doing this? So you can search. BAM! BING! OTHERSOUND! Isn’t that amazing?
These tags will go into the file’s “comment” field, which Spotlight indexes, so you can search for these files later.
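One way to write that comment from Go on a Mac is to shell out to osascript and ask Finder to set it. The sketch below assumes that approach; it is an illustration rather than a copy of my script, and the helper names (commentArgs, setFinderComment) are invented for it. It only works on macOS, and Finder may ask for automation permission the first time.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// commentScript is a small AppleScript that takes the file path and the
// comment as arguments, so neither has to be escaped into the script text.
const commentScript = `on run {f, c}
	tell application "Finder" to set comment of (POSIX file f as alias) to c
end run`

// commentArgs builds the osascript argument list for one file.
// Tags are joined with ", " (an arbitrary choice for this sketch).
func commentArgs(path string, tags []string) []string {
	return []string{"-e", commentScript, path, strings.Join(tags, ", ")}
}

// setFinderComment writes the tags into the file's Finder comment.
func setFinderComment(path string, tags []string) error {
	abs, err := filepath.Abs(path) // POSIX file expects an absolute path
	if err != nil {
		return err
	}
	out, err := exec.Command("osascript", commentArgs(abs, tags)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("osascript: %v: %s", err, out)
	}
	return nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: setcomment <file> <tag> [tag...]")
		return
	}
	if err := setFinderComment(os.Args[1], os.Args[2:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Passing the path and comment as arguments to the run handler, rather than pasting them into the script string, means a file name with quotes in it cannot break the AppleScript.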
Here is how I got this working in about an hour. First thing I did was move a bunch of photos to a folder on my computer, to simulate my photos directory of yore.
Then, I downloaded, installed, and ran Tagbox, which took a few minutes, but that is because it’s a product from my company and I know it well. If you’re a developer or technically inclined, this won’t take you long either.
Next, I wrote a Go script that iterates over the directory, posts each image to Tagbox, gets the tags back, and puts them into the comments field in the file.
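The “post each image and get the tags back” step might look roughly like the sketch below. Treat the details as assumptions to check against the Tagbox documentation: the endpoint http://localhost:8080/tagbox/check, the multipart field name "file", and the response shape (a "success" flag plus a "tags" list of tag/confidence pairs) are what I would expect, and the function names are made up for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

// checkResponse mirrors the JSON shape assumed for Tagbox's reply.
type checkResponse struct {
	Success bool   `json:"success"`
	Error   string `json:"error"`
	Tags    []struct {
		Tag        string  `json:"tag"`
		Confidence float64 `json:"confidence"`
	} `json:"tags"`
}

// parseTags extracts the tag names from a response body.
func parseTags(body []byte) ([]string, error) {
	var res checkResponse
	if err := json.Unmarshal(body, &res); err != nil {
		return nil, err
	}
	if !res.Success {
		return nil, fmt.Errorf("tagbox: %s", res.Error)
	}
	tags := make([]string, 0, len(res.Tags))
	for _, t := range res.Tags {
		tags = append(tags, t.Tag)
	}
	return tags, nil
}

// tagImage uploads one image as multipart form data and returns its tags.
func tagImage(endpoint, path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	part, err := w.CreateFormFile("file", filepath.Base(path)) // assumed field name
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(part, f); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil { // writes the closing multipart boundary
		return nil, err
	}

	resp, err := http.Post(endpoint, w.FormDataContentType(), &buf)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseTags(body)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: tagimage <image-file>")
		return
	}
	// Assumed address of a Tagbox instance running on this machine.
	tags, err := tagImage("http://localhost:8080/tagbox/check", os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(tags)
}
```

Since the request only ever goes to localhost, the photo never leaves your machine, which is the whole point.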
Then I ran the script and voila — EXTREME EXCELLENCE.
Now I can search for any photo on my Mac by what is in it. No need to send my personal photos to the cloud, or to Google, or to… other weirdos. It ran locally, it ran quickly, and it ran EXCELLENTLY.
This is just the beginning. Since you’re an amazing developer, you can take this to the next level and build an actual application with other features, or perhaps explore injecting the tags in other places, such as the EXIF data or a separate XML sidecar file. I’m here to give you ideas; you’re here to do amazing things with them, because, let’s be honest, I’m a terrible developer.
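To give the sidecar idea a starting point, here is a minimal sketch that writes the tags to an XML file next to the photo and reads them back. The layout (a photo element with a file attribute and a list of tag elements) and the ".xml" suffix are made up for this example; it is not XMP or any other standard, so swap in a real schema if other tools need to read it.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"os"
	"path/filepath"
)

// sidecar is a hypothetical layout invented for this sketch.
type sidecar struct {
	XMLName xml.Name `xml:"photo"`
	File    string   `xml:"file,attr"`
	Tags    []string `xml:"tags>tag"`
}

// marshalSidecar renders the tags for one photo as an indented XML document.
func marshalSidecar(file string, tags []string) ([]byte, error) {
	out, err := xml.MarshalIndent(sidecar{File: file, Tags: tags}, "", "  ")
	if err != nil {
		return nil, err
	}
	return append([]byte(xml.Header), out...), nil
}

// readSidecarTags parses a sidecar document and returns its tags.
func readSidecarTags(data []byte) ([]string, error) {
	var s sidecar
	if err := xml.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return s.Tags, nil
}

// writeSidecar saves the tags next to the image as "<image>.xml".
func writeSidecar(imagePath string, tags []string) error {
	data, err := marshalSidecar(filepath.Base(imagePath), tags)
	if err != nil {
		return err
	}
	return os.WriteFile(imagePath+".xml", data, 0o644)
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: sidecar <image-file> <tag> [tag...]")
		return
	}
	if err := writeSidecar(os.Args[1], os.Args[2:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A sidecar keeps the original image untouched, at the cost of having two files to move around together.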
Building a private, local photo search app using machine learning was originally published in Towards Data Science on Medium.