People rarely have trouble telling a tiger from a lion; in fact, your average three-year-old can do it. But how about an algorithm?
A tiger cub and a lion cub, presumably before they can classify themselves.
While tigers are brightly striped and lions are a drab brown color, when you think about it, they still look pretty similar. They’re both big cats with features shared by all felines: round faces with short noses, tails, ears, and claws. How could you possibly build an algorithm that could tell a tiger from a lion?
The answer is that you rely on publicly available pre-trained deep learning models to begin, and then let Dataiku’s new deep learning plugin do the rest of the work. Let’s dive in, shall we?
How the Dataiku deep learning plugin works
The way the Dataiku deep learning plugin works is pretty straightforward:
- First, add your images into Dataiku. In this case, you would use an image set composed of photographs of lions and tigers.
- Then import a pre-trained model via the plugin. In this case, we use one of the ImageNet pre-trained models.
- Using transfer learning, retrain the model with your images. This adds an extra layer to the neural network that is entirely customized to your images.
- Take a new set of your images and classify them using your retrained model.
Deep learning in Dataiku is as easy as 1-2-3. ("4" is your answer!)
In essence, what you’re doing is building upon the layers of the pre-trained model, which recognize, for example, shapes, lines, and other patterns, and adding a layer that focuses its intelligence on just two possible outcomes: lion or tiger.
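To make the "frozen layers plus one new layer" idea concrete, here is a toy sketch in plain Python. It is not what the plugin runs internally. The `frozen_features` function is a hypothetical stand-in for the pre-trained layers, the "images" are made-up lists of pixel intensities (striped for tigers, flat for lions), and the only part that learns is a small logistic layer with two possible outcomes.

```python
import math
import random

def frozen_features(pixels):
    """Stand-in for the pre-trained layers: a fixed mapping that is never updated."""
    mean = sum(pixels) / len(pixels)
    # Average jump between neighbouring pixels, a crude measure of "stripiness"
    contrast = sum(abs(a - b) for a, b in zip(pixels, pixels[1:])) / (len(pixels) - 1)
    return [mean, contrast]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(features, labels, epochs=500, lr=0.5):
    """Train the only trainable part: one logistic layer on top of the frozen features."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss with respect to the score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def classify(pixels, weights, bias):
    """Run an image through the frozen features, then through the new layer."""
    x = frozen_features(pixels)
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return "tiger" if score >= 0.5 else "lion"

def make_image(kind, rng, size=16):
    """Synthetic toy 'image': alternating bright/dark pixels for tigers, flat for lions."""
    if kind == "tiger":
        return [(0.9 if i % 2 == 0 else 0.1) + rng.uniform(-0.05, 0.05) for i in range(size)]
    return [0.5 + rng.uniform(-0.05, 0.05) for _ in range(size)]

# Build a small labeled training set (1 = tiger, 0 = lion), alternating the classes
rng = random.Random(0)
kinds = ["tiger", "lion"] * 10
train_images = [make_image(kind, rng) for kind in kinds]
train_labels = [1 if kind == "tiger" else 0 for kind in kinds]

# "Retraining" only touches the new layer; frozen_features stays exactly as it was
train_features = [frozen_features(img) for img in train_images]
weights, bias = train_head(train_features, train_labels)

print(classify(make_image("tiger", rng), weights, bias))
print(classify(make_image("lion", rng), weights, bias))
```

The point of the design is that the expensive part (here a two-line function, in reality millions of learned parameters) is reused as is, so only a handful of numbers need to be learned from your own images.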
And the key benefit to you, as a user, is that it saves you the massive amount of time and computation required to train such a model from scratch, and you can do every single step without writing a line of code.
Under the Hood of the Plugin
Even if you’re not a Dataiku user, you can follow the way the plugin works in our sample gallery project, which requires no download and runs entirely in your browser.
We can start with the flow, which is the basic organizing view of all projects in Dataiku. As you can see, we start with the pre-trained model, along with two sets of our lion/tiger images (one for training, one for classifying). We add labels to our training set, then we use the plugin to retrain the pre-trained model with this labeled training set.
Now, all we have to do is classify our remaining images, and what do you know, we did pretty well!
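If you want to put a number on "pretty well", one simple option is to compare the predicted labels with the true ones. The sketch below is a minimal illustration in plain Python; the `evaluate` helper and the two label lists are hypothetical and hand-made for the example, not results from the gallery project.

```python
from collections import Counter

def evaluate(true_labels, predicted_labels):
    """Return overall accuracy and a count of each (true, predicted) label pair."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    confusion = Counter(zip(true_labels, predicted_labels))
    correct = sum(n for (truth, pred), n in confusion.items() if truth == pred)
    return correct / len(true_labels), confusion

# Hypothetical labels for six images, made up purely for illustration
truth = ["lion", "lion", "tiger", "tiger", "tiger", "lion"]
predicted = ["lion", "tiger", "tiger", "tiger", "tiger", "lion"]

accuracy, confusion = evaluate(truth, predicted)
print(f"accuracy: {accuracy:.2f}")
for (true_label, predicted_label), count in sorted(confusion.items()):
    print(f"true={true_label} predicted={predicted_label}: {count}")
```

The pair counts are worth a look alongside the accuracy, since they show which way the mistakes go (lions mistaken for tigers, or the reverse).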
If you want to dive in even deeper, check out our step-by-step guide to using the Dataiku deep learning plugin. If you want, you could even try to tell a sheepdog apart from a mop. Warning: it might be a little more difficult than the lion/tiger example.