When and When Not to Use Deep Learning

Data Basics, Scaling AI, Featured Nancy Koleva

Deep learning is all the rage today, as major breakthroughs in the field of artificial neural networks in the past few years have driven companies across industries to implement deep learning solutions as part of their AI strategy. From chatbots in customer service to image and object recognition in retail, and much more, deep learning has unlocked a myriad of sophisticated new AI applications. But is it always the right way to go?

The outstanding performance of deep learning algorithms with complex tasks requiring huge amounts of data, combined with the increasing availability of pre-trained models on publicly available data, have made deep learning particularly appealing to many organizations in the past few years. However, this doesn’t mean that deep learning is the answer to all machine learning (ML) related problems. 

But how to tell when deep learning is necessary, and when it isn’t? Obviously, each use case is very individual and will depend on your specific business objectives, AI maturity, timeline, data and resources, among other things. Below are some of the general considerations to take into account before deciding whether or not to use deep learning to solve a given problem.

Discover more 2020 data science and ML trends in the full video report

1. How complex is your problem? What are you trying to predict or discover?

One of the main advantages of deep learning lies in being able to solve complex problems that require discovering hidden patterns in the data and/or a deep understanding of intricate relationships between a large number of interdependent variables. Deep learning algorithms are able to learn hidden patterns from the data by themselves, combine them together, and build much more efficient decision rules.

definitions of AI vs. machine learning vs. deep learning

Check out this blog post for a refresher on the difference between AI, ML and DL

Deep learning really shines when it comes to complex tasks, which often require dealing with lots of unstructured data, such as image classification, natural language processing, or speech recognition, among others. However, for simpler tasks that involve more straightforward feature engineering and don’t require processing unstructured data, classical machine learning may be a better option. 

2. How will you evaluate the success of your model? How important is accuracy vs. interpretability?

While deep learning reduces the human effort of feature engineering, as this is automatically done by the machine, it also increases the difficulty for humans to understand and interpret the model. In fact, model interpretability is one of deep learning’s biggest challenges.

When evaluating any machine learning model, there is usually a tradeoff to be made between accuracy and interpretability. Deep networks have achieved accuracies that are far beyond that of classical ML methods in many domains, but as they model very complex situations with high non-linearity and interactions between inputs, they are nearly impossible to interpret. 

On the contrary, due to the direct feature engineering involved in classical ML, these algorithms are quite easy to interpret and understand. What’s more, choosing between different classical ML algorithms or tuning hyper-parameters is more straightforward since we have a more thorough understanding of the data and inner workings of the algorithms.

3. How much time and resources are you willing to allocate?

Despite its increasing accessibility, deep learning in practice still remains a complicated and expensive endeavor. For one thing, due to their inherent complexity, the large number of layers and the massive amounts of data required, deep learning models are very slow to train and require a lot of computational power, which makes them very time- and resource-intensive. Graphics Processing Units, or GPUs, have practically become a requirement nowadays to execute deep learning algorithms. GPUs are very expensive yet without them training deep networks to high performance would not be practically feasible. 

Classical ML algorithms can be trained just fine with just a decent CPU (Central Processing Unit), without requiring the best of the best hardware. Because they aren’t so computationally expensive, one can also iterate faster and try out many different techniques in a shorter period of time.

4. Do you have enough quality labeled data?

Don’t think of deep learning as a model learning by itself. You still need properly labeled data, and a lot of it! One of deep learning’s main strengths lies in being able to handle more complex data and relationships, but this also means that the algorithms used in deep learning will be more complex as well. And to extract general enough complex patterns from data, you will need lots - read LOTS - of examples (much more than a traditional ML model) - typically millions of labeled data points for a classification task.

graph showing machine learning vs deep learning processImage Source

The lack of a sufficiently large corpus of precisely labeled high-quality data is one of the main reasons why deep learning can have disappointing results in some business cases. Organizations face a myriad of challenges when it comes to cleaning, preparing and labeling data, and these processes often take up a massive volume of time and resources from teams that could be better spent building the next machine learning models or pushing models into production. 

One way to address the cumbersome task of data labeling is through active learning (learn more in this white paper), but in cases where having very large amounts of efficiently labeled data is not possible (or necessary), opting for more classical ML algorithms and models may be a better use of your data teams’ time, effort, and resources. 

Recap: To DL or not to DL?

hamlet play to be or not to be

Finally, taking all these considerations into account, here is a (non-exhaustive) breakdown of when you should use classical machine learning, versus when deep learning will be more appropriate:

Classical Machine Learning:

Deep Learning:

  • Interpretability and explainability are paramount
  • Smaller amounts of relatively simple data
  • Straightforward feature engineering
  • Limited computational power
  • Limited time, need for faster prototyping and operationalization
  • Need for varied algorithm choices
  • Accuracy of test dataset results is acceptable 
  • Very high accuracy is a priority (and primes over straightforward interpretability and explainability)
  • Large amounts of precisely labeled data
  • Complex feature engineering
  • Powerful compute resources available (GPU acceleration)
  • Augmentation and other transformations of the initial dataset will be necessary

Whether you decide to go with classical machine learning, or deep learning for a certain data project, a common thread is that there should be human oversight, evaluation and decision-making involved at every step of the process. This human-in-the-loop augmented intelligence is the key to truly responsible and transparent AI. 

You May Also Like

Why You Should Be Using Apache Spark + Kubernetes To Process Data

Read More

Is Yann LeCun the New Marie Curie?

Read More

The Many Flavors of Gradient Boosting Algorithms

Read More