For all the talk about how AI is the future of data management, it can sometimes be hard to keep things in perspective. Yes, the future is bright; but it won’t arrive on its own. From industry to industry, and from company to company, everyone from data scientists to IT managers to line-of-business managers needs to be aligned on what it will take to turn that future into a reality.
Each company’s journey to achieving Everyday AI — including MLOps and a robust AI Governance practice — is unique. Larger companies with big, centralized data teams may have the resources to spend on AI but may also be weighed down by tech debt and legacy systems. Smaller, younger companies may be more nimble and ready to adopt new technologies, but may also lack the resources to scale their operations at pace.
Yet despite these differences, the challenges, pitfalls, and tricks to moving up the AI maturity curve are relevant to all enterprises. This was a theme central to two of the talks given at this year’s Everyday AI conferences in London and Bengaluru. In this blog, we’ll summarize the insights shared by CJ Jenkins, Head of Data at Devoteam Creative Tech Sweden, and Padmashree Shagrithaya, Global Head of Analytics and Data Science at Capgemini.
Learning to Scale
The challenge of scaling one’s AI operations represents one of the most important inflection points along the path to AI maturity. As Shagrithaya explained in her Bengaluru talk, the problem of scale arises when companies confront a familiar monster: complexity.
Companies that have already begun their journeys will have anywhere from a handful to dozens of models in operation across their workflows. In fact, in most cases these companies already run several complex processes, each of which requires the input of many models. Shagrithaya gave the example of sales prediction or demand forecasting at a company that makes many kinds of products. For a single stock-keeping unit, or SKU, the company will likely be running multiple models in order to predict sales in the coming months. But of course, such a company will have hundreds if not thousands of SKUs. The number of models required to forecast demand across the board — even if they are only tweaked versions of one another — multiplies accordingly.
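To make that arithmetic concrete, here is a minimal sketch of how a model inventory balloons in the kind of forecasting scenario Shagrithaya described. The SKU counts and horizon-based variants below are illustrative assumptions, not figures from the talk:

```python
from itertools import product

# Purely illustrative numbers — not figures from the talk.
skus = [f"SKU-{i:04d}" for i in range(1500)]
horizons = ["1m", "3m", "6m"]  # one tweaked model variant per forecast horizon

# One entry per (SKU, horizon) pair, each a model to train, deploy, and monitor.
model_registry = {
    (sku, h): f"demand_model_{sku}_{h}" for sku, h in product(skus, horizons)
}
print(len(model_registry))  # 4500 models from a single forecasting use case
```

Even with modest numbers, a single use case yields thousands of artifacts to keep in sync, which is exactly where the complexity monster shows its teeth.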
The problem of scaling therefore becomes less about adopting complex processes, and more about wrangling all of the processes one has already adopted. New tech adoption should never cease, of course; but at the essential inflection point represented by the need to scale, the breakthrough is going to come in the form of chaos minimization. Simplifying, unifying, and aligning one’s existing processes: that is the way to drive scaling.
“The biggest breakthrough today is not going to be about a single algorithm ... [but about] the ability to bring all of them together, to deploy them and run them together in real time.” — Padmashree Shagrithaya, Global Head of Analytics and Data Science at Capgemini
CIOs and CTOs who adopt this anti-chaos mindset will have prepared themselves to find meaningful answers to what Shagrithaya believes to be the five key questions that drive any company’s path toward scale:
- How do we reduce time-to-market from POC to production?
- How do we improve ROI with an Integrated AI Platform (like Dataiku)?
- How do we break down the silos between data scientists, ML engineers, and business teams?
- How do we govern ML models to minimize risk and ensure regulatory compliance?
- How can we retrain, recalibrate, and redeploy ML models?
Looking at these essential questions, Shagrithaya pointed out that no AI journey can be properly scaled without developing a robust MLOps component at some point or another. In fact, as she put it, MLOps is “the missing piece in the Enterprise AI puzzle.” As the audience members at Bengaluru learned, good MLOps discipline involves four components:
- Efficient Model Development: MLOps simplifies the path from modeling to production by streamlining the handoff between development and deployment.
- Model Monitoring: MLOps allows both production and AI teams to monitor models in ways specific to machine learning, proactively tracking data drift, shifts in feature importance, and model accuracy issues (see the drift-check sketch after this list).
- Production Governance: MLOps enables companies to minimize corporate and legal risks, maintain a transparent production model management pipeline, and minimize or even eliminate model bias.
- Machine Learning Lifecycle Management: MLOps allows for a production model lifecycle management system that automates the retrain, recalibrate, and redeploy cycle.
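To make the monitoring component concrete, here is a minimal sketch of one common way to flag data drift on a single numeric feature, using a two-sample Kolmogorov–Smirnov test. The data and the 0.01 threshold are illustrative assumptions; an MLOps platform would typically run checks like this automatically across all features:

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_alert(train_values, live_values, p_threshold=0.01):
    """Flag drift when a feature's live distribution differs significantly
    from its training distribution (two-sample KS test).
    The 0.01 threshold is an illustrative choice, not a standard."""
    stat, p_value = ks_2samp(train_values, live_values)
    return p_value < p_threshold, stat

# Hypothetical data: training was centered at 0, production has shifted.
rng = np.random.default_rng(42)
train = rng.normal(loc=0.0, scale=1.0, size=5000)
live = rng.normal(loc=0.4, scale=1.0, size=5000)

drifted, stat = drift_alert(train, live)
print(f"drift detected: {drifted} (KS statistic {stat:.3f})")
```

The point is less the specific test than the discipline: every production model gets a scheduled, automated comparison between what it was trained on and what it now sees.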
But of course, developing such a robust MLOps discipline is easier said than done, even as it is an essential part of any company’s long-term AI journey. This is something that CJ Jenkins knows a thing or two about. As Jenkins discussed at Everyday AI London, any firm’s ambitions regarding machine learning need to be tempered by appropriate expectations of the challenges to achieving MLOps discipline.
Failing Up
Failure is the backbone of innovation. That’s why, for the London crowd, Jenkins was bold enough to talk about all the ways she herself has failed at machine learning discipline, as a means of illustrating both what not to do and how failure itself brings the kind of insights that allow you to help others succeed.
Jenkins began her talk with a startling statistic: 80% of machine learning projects fail to make it into production. But why should that concern us? After all, we just said that failure is the backbone of innovation! Well, that’s true only if we learn from it. We need to figure out why things failed so that we succeed the second time around, and so that we can bring that 80% down as low as possible.
For this very reason, Jenkins decided to present on what she has learned from her past failures. As we’ll now summarize, she distilled that experience into five lessons for succeeding at machine learning.
1. Know Your Product
“Building a data product (machine learning algorithm, dashboards, etc.) without closely working with the product teams is a waste of effort,” Jenkins said. If you aren’t in touch with your product teams, you won’t be aligned on the value added by any given project.
Jenkins told the story of a project she once led, which involved building a model that allowed her data team to see exactly where, geographically, a transaction had taken place. After spending time and resources building the model, her team presented it to the product team, who hadn’t been consulted in advance, only to discover that they saw no use or value in it.
This kind of non-communication is a classic blocker to machine learning projects seeing the light of day and, consequently, to the development of good MLOps discipline within one’s data team. Rather than thinking about what you, as a data scientist, find interesting, she said, “think about how the data that you are collecting can better help [the product team’s] product, and then build that machine learning algorithm.”
2. Know Your Data
The second point that Jenkins stressed was that, in essence, data itself is meaningless without anyone there to properly understand it. “I think a lot of people have this misunderstanding that…you can just take a ton of data, throw it into the algorithm, and magic is going to come out,” she said.
It may sound simple, but it’s a requirement that’s often overlooked. You have to understand what data you have (and, equally important, what data you don’t have), what the distributions are, what the relevant factors are, and what the salient categories are. And your best way toward such understanding is, again, to speak to your product teams. They will know best, and they will help you see what models and features can be built around the data you have.
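As a small illustration of that audit habit, here is a minimal pandas sketch of the first questions to ask of a new dataset. The table and column names are hypothetical:

```python
import pandas as pd

# Hypothetical transactions table; the columns are invented for illustration.
df = pd.DataFrame({
    "amount": [12.5, 99.0, None, 7.25, 7.25],
    "merchant": ["AMZN", "Amazon.com", "AMAZON", None, "Spotify"],
    "country": ["SE", "SE", "US", "SE", "SE"],
})

# Know what you don't have: share of missing values per column.
print(df.isna().mean())

# Know your distributions and salient categories before modeling.
print(df["amount"].describe())
print(df["merchant"].value_counts(dropna=False))
```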
3. Learn to Love Cleaning Data
What they don’t always teach you in school is that the data you work with in real life will, of course, be anything but clean when you get it. Trained data scientists are much like trained physicists in this way: the classroom examples are always frictionless, perfect, and without noise. The data you trained on was likely analogous: few missing values, nice distributions, meaningful categories, and so on.
Naturally, no data is like that in the real world. Jenkins pointed out that many of her colleagues who come out of master’s degrees in data science are initially baffled by this seemingly obvious fact. Jenkins gave the example of trying to collate all the merchants listed on card purchase transactions across millions of cards and users. “I counted at one point that there were 68 different ways to say ‘Amazon’ [in the dataset],” she said. “And that was just one merchant.”
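Here is a minimal sketch of what that cleanup can look like, using a hypothetical rule-based normalizer. The patterns below are invented for illustration and cover only a fraction of the variants a real transaction feed would contain:

```python
import re

# Hypothetical canonicalization rules — nowhere near the 68 real-world
# "Amazon" variants Jenkins found, but the same idea.
CANONICAL = {
    re.compile(r"amazon|amzn", re.IGNORECASE): "Amazon",
    re.compile(r"spotify", re.IGNORECASE): "Spotify",
}

def normalize_merchant(raw: str) -> str:
    """Map a raw merchant string to a canonical name, if a rule matches."""
    cleaned = raw.strip()
    for pattern, name in CANONICAL.items():
        if pattern.search(cleaned):
            return name
    return cleaned  # leave unknown merchants untouched

for raw in ["AMZN Mktp SE*1X2Y3", "Amazon.com", "AMAZON PRIME", "SPOTIFY P0123"]:
    print(raw, "->", normalize_merchant(raw))
```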
As a reflex, many (often younger) data scientists look at the chaos of messy data and conclude that they can never build a meaningful machine learning model on top of it. But this is a bad instinct: “you should never throw the baby out with the bathwater,” Jenkins said. You should hold it as a guiding principle that data should always be made clean wherever possible. In fact, you should yearn for nothing more than to clean it yourself.
4. Simple Is Better Than Complex
There’s no denying how exciting machine learning is. Every day, new potential is discovered and new powers are unlocked at the frontiers of data science research and practice. Jenkins sees this excitement all the time in her younger team members. They often join wanting to jump right into deep learning or build neural networks.
But the truth is that 90% of machine learning algorithms are not deep learning models or neural networks. And while it’s fantastic to have the ambition to build intellectually challenging processes, it’s important not to overcomplicate things. More often than not, if a process can be done more simply, it is best to do it that way.
Jenkins gave an example from a past work experience at a fintech company. Her team had to develop an algorithm to determine whether any given borrower was going to pay the company back for the loan they received. “Yes or no? It’s a binary classification, and it’s pretty simple,” she said. And so she thought it best to use a gradient-boosted tree algorithm.
But she had a manager who was really into neural networks, and who could build them really well. So she decided to build both and see which model made better predictions. “And it turned out that the gradient-boosted tree algorithm was better across every single metric,” Jenkins said. Had the team begun with the principle that the simplest solution was the best, they might have saved some valuable time. That said, the experiment itself helped prove that principle.
“So don't just jump at the latest and greatest advancement. You might waste time, effort, and a lot of computational resources — neural networks are pretty expensive to train.”
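For readers who want to run a similar bake-off, here is a minimal sketch using scikit-learn, with synthetic data standing in for the loan dataset. Neither the features nor the network architecture reflect what Jenkins’s team actually built:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the loan-repayment data (binary target).
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [
    ("gradient-boosted trees", GradientBoostingClassifier(random_state=0)),
    ("neural network", MLPClassifier(max_iter=500, random_state=0)),
]:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```

The value of the exercise is the head-to-head comparison on the same split and the same metric: whichever result comes out on top, you have evidence rather than a hunch, at the cost of training the fancier model once.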
5. Perfection Is the Enemy of the Finished
For all of the rigor and discipline needed to implement points one through four, Jenkins stressed that the goal for most models shouldn’t be perfection, for the simple reason that it can’t be. If you obsess over achieving perfection you may well never deploy a single model, which is of course the worst outcome.
So, a balance needs to be struck. Your code and algorithms need to be good enough not to cause problems, especially down the road. This means doing your due diligence: forecasting likely problem scenarios and checking whether your existing model can withstand them. But you don’t need to achieve perfection; you don’t need something so bulletproof that nothing could possibly cause its undoing, because you’ll never succeed at that.
Rather than it being perfect, it simply needs to be finished. You need to get it out the door so that you can actually test it on production data. If you never get it out the door, you’ll never know if it’s working. “Your production data isn’t always going to look like your training and testing data, and you’re not going to know that until you get it in production,” Jenkins said.
So forget about perfection. Focus on the possible and the plausible. And, in Jenkins’s terms, “become a finisher.”