While it is data executives who ultimately purchase a data science and machine learning platform to help teams deliver business value, they are unlikely to be its day-to-day users. The users, typically more technical data practitioners, are the ones who deliver use cases and generate tangible value from the platform, justifying its cost. So what exactly are the benefits of a platform for these users? To illustrate them in practice, we’ll outline three principles of scalable, enterprise-grade AI: moving from an individual to a team sport mindset, thinking of analytics as a product, and building to scale.
1. Moving From an Individual to Team Sport Mindset
If data practitioners have only ever done data science in an individual silo, it can be challenging to reframe that mindset. In soccer, for example, a good player is open to the idea that when someone passes her the ball, the best way for the team to score might be to pass it on to someone else instead of trying to score herself! By the same logic, even if a model is first developed by one individual, it is best improved by another team member who will challenge assumptions, unlock new ideas, and improve robustness. These are the benefits that platforms like Dataiku create for users. Without a platform, much of the peer feedback will focus on narrow technical implementation details instead of on business value.
From there, the team can decide together what standards must be met to take collective ownership of each model. In practice, this means that when something breaks in production, anyone can fix it, avoiding the exposure created when the one person responsible for a given model happens to be away on vacation. Taking this principle further, if different teams across the company (not just the data science team) align on the same standards, overall resilience increases because people can rotate between teams more easily, whether by plan or in an emergency. And by encoding these standards in a platform, you avoid data science “busy work,” keep data scientists engaged and excited by the projects they’re working on, and satisfy their drive to push their capabilities to new levels.
2. Thinking of Analytics as a Product
Treating every model the team builds as a data product means the steps taken (from raw data to training, deployment, and retraining) should be clearly understood by any consumer of the product. Not only does this drive transparency among team members, it can be immensely helpful from an audit and compliance perspective, especially in highly regulated industries. This transparency in turn helps the team take responsibility for investigating errors themselves first, before throwing up their hands and bringing in IT. The more a model is treated as a self-contained product, with clear ownership of its end-to-end process, the better the team can practice data science as a team sport, with greater impact and reliability. Platforms abstract away the common underpinnings of individual data products, ensuring more time is spent building and improving models.
Ideally, models-as-data-products will even proactively communicate issues to their consumers. Over time, the goal is that every model becomes a product fully supported by IT, with specific SLAs agreed for every model (whether for internal or external consumption), granting these products appropriate weight in IT decision making. By taking a platform approach to this product scope, team members can easily spot and diagnose when data is drifting and determine how to react and communicate to stakeholders, without having to reinvent their response every time.
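To make “spotting when data is drifting” concrete, here is a minimal sketch of one common drift metric, the population stability index (PSI), computed with NumPy. This is illustrative only, not a feature of any particular platform; the function name and sample data are invented for the example, and the thresholds are conventional rules of thumb.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline (e.g., training) sample and live data.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift."""
    # Decile-style bin edges computed from the baseline sample
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    # Widen the outer edges so out-of-range live values still land in a bin
    edges[0] = min(edges[0], np.min(actual)) - 1e-9
    edges[-1] = max(edges[-1], np.max(actual)) + 1e-9
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid log(0) on empty bins
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

rng = np.random.default_rng(42)
train = rng.normal(0.0, 1.0, 5_000)    # feature values seen at training time
shifted = rng.normal(0.8, 1.0, 5_000)  # production values after the mean drifts

print(population_stability_index(train, train))    # → 0.0 (identical samples)
print(population_stability_index(train, shifted))  # well above the 0.25 "major shift" threshold
```

Running a check like this per feature on a schedule, and alerting when the PSI crosses a threshold, is the kind of standardized response a platform lets teams reuse instead of reinventing for every model.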
3. Building to Scale
Even without any platform, expert data scientists may eventually deliver enough products to realize the short-term benefits of standardization, e.g., shared code environments, CI/CD pipelines, and reusable data assets. They may even commit to investing in refactoring existing products to reduce technical debt, which has a long-term benefit: when they leave, their work is future-proofed and doesn’t send the company into a spiral of things breaking. It stands to reason that those with prior experience solving these challenges will best understand the opportunity cost of building all that from scratch instead of investing in a platform that delivers those benefits immediately.
Platforms can help these expert data scientists take another step toward scaling the impact of data and AI in their organizations, even beyond their own tenure: recognizing that they will never get to all the use cases they want to address, even with a perfect platform for their needs. Platforms with a broader constituency, which upskill non-experts to contribute to and co-build data science projects, lower the barrier to entry and the onboarding time for less experienced practitioners, enabling more business value to be delivered without hiring more experts. In the best case, expert data scientists work with these newly involved teams (i.e., business SMEs) to define standards relevant across all skill levels, so everyone can contribute to the business’s ambitions and new hires are onboarded in days, not months. Not only does this increase the pool of people involved in data science and AI projects, it frees the experts to focus on increasingly ambitious, high-value projects.
Platforms Alter the Economics of Data and AI
According to a study published in the MIT Sloan Management Review, “A good rule of thumb is that you should estimate that for every $1 you spend developing an algorithm, you must spend $100 to deploy and support it.” The study, titled “Getting Serious About Data and Data Science” by Thomas Redman and Tom Davenport, nicely illustrates how the three aforementioned principles change the economics of data and AI.
Cultivating a team sport mindset and adopting a product scope are about reducing the costs of AI: for each $1 you spend developing a product, spend only $50 or even $10 deploying and supporting it (e.g., by avoiding the unexpected cost of hiring experts to fix someone else’s model that wasn’t built to product-level quality). Building to scale, though, means not accepting that the entire $101 budget should be spent on the model at hand; part of it should be reinvested to grow the human capital available to the organization in the future, without growing headcount. To truly scale AI you need to do both: reduce costs per product AND grow the budget for products.
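The arithmetic above can be sketched as a toy cost model. The function and the budget figure are purely illustrative; the ratios come from the Redman and Davenport rule of thumb and the reduced ratios suggested in this section.

```python
def total_cost_per_product(dev_cost=1.0, deploy_ratio=100):
    """Total spend on one product when every $1 of development
    needs $deploy_ratio more to deploy and support it."""
    return dev_cost * (1 + deploy_ratio)

budget = 1_010.0  # illustrative annual AI budget

# Default economics: $1 to develop, $100 to deploy and support
print(budget / total_cost_per_product(deploy_ratio=100))  # → 10.0 products

# Team-sport + product-scope practices cut support costs to $10 per $1
print(budget / total_cost_per_product(deploy_ratio=10))   # → ~91.8 products
```

The same budget funds roughly nine times as many products once the deploy-and-support ratio drops, which is the sense in which these principles change the economics rather than just trimming a line item.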