After witnessing the success of DevOps, the analytics community quickly recognized the opportunity to apply similar practices to analytics work. Enter MLOps, a way for data teams and analytics teams alike to benefit from the automation of model deployment and integration. Organizations now face the new challenge of defining and navigating this wellspring of opportunity.
Aimpoint Digital is a Dataiku partner that supports pre-sales to post-sales enablement across a variety of client environments. Offering custom solution development across a variety of industry verticals, Aimpoint Digital organizes its offerings into three main business segments: data analytics, data science, and data engineering/infrastructure. Working through the entire data analytics lifecycle, with leading experts in an array of industries (e.g., retail, manufacturing, and life sciences), Aimpoint Digital is focused on implementing usable solutions across value chain functions.
In the Dataiku Product Days session summarized in this blog, Aaron McClendon, Head of Data Science at Aimpoint Digital, and Caroline Osuna, Head of Data Engineering at Aimpoint Digital, help us understand the concept of MLOps while homing in specifically on MLOps automation and deployment in manufacturing.
MLOps is the practice of automating the deployment, integration, and monitoring of machine learning (ML) models, and this automation is crucial to increasing the speed at which organizations can release models into production. MLOps also involves ensuring the continuous quality and dynamic adaptability of projects throughout the entire model lifecycle. The goals and functions of MLOps can be grouped under four general principles.
Aimpoint Digital's 4 Key MLOps Principles
- Automation: Facilitating the process of building pipelines from the start of feature creation all the way to model training and deployment
- Reproducibility: Ensuring idempotence of model results through versioning and infrastructure as code
- Monitoring and Alerting: Monitoring model performance and the quality of predictions along with configured alert notifications
- Testing: Implementing tests that validate data, models, and their applications while enforcing governance policies
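To make the reproducibility principle a little more concrete, here is a minimal sketch of one way to fingerprint a training run so that identical inputs can be recognized as the same version. It is an illustration only, not Aimpoint Digital's or Dataiku's method. The function name `run_fingerprint`, the configuration keys, and the sample rows are all hypothetical.

```python
import hashlib
import json


def run_fingerprint(config: dict, rows: list) -> str:
    """Return a stable SHA-256 fingerprint for a training run.

    The same config and data always give the same fingerprint,
    regardless of the key order of the dictionaries.
    """
    # sort_keys makes the JSON text independent of dict insertion order.
    payload = json.dumps({"config": config, "rows": rows}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical training configuration (same content, different key order).
config_a = {"model": "gradient_boosting", "max_depth": 4, "seed": 42}
config_b = {"seed": 42, "max_depth": 4, "model": "gradient_boosting"}

# Hypothetical training rows.
rows = [
    {"temperature": 71.5, "defect": 0},
    {"temperature": 88.1, "defect": 1},
]

# Identical inputs map to the same fingerprint.
same = run_fingerprint(config_a, rows) == run_fingerprint(config_b, rows)

# Changing a single hyperparameter yields a different fingerprint.
changed = run_fingerprint({**config_a, "max_depth": 5}, rows)
```

Storing a fingerprint like this next to each trained model is one simple way to tell whether two runs really used the same inputs. In practice, a version control system and infrastructure as code would carry most of that burden.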
Speeding up time to insight through continuous integration, streamlined deployment of modular components, and automated model retraining and redeployment is highly beneficial in a quickly changing production environment. Data security concerns and evolving compliance policies, a constant pain point for manufacturing organizations, can be addressed with automated retraining, which makes it easy to verify that models adapt. Essentially, effective MLOps systematizes the process so that adapting to changing environments is easier.
Real-time checks and rapid updates introduce reliability to processes. MLOps involves detecting model drift and/or changes in the distribution of the underlying data for business-critical processes, and it confirms that data-driven decisions are supported by quality-monitored models. High availability and reliability of models are also checked in real time through templated infrastructure, which, importantly, can be easily traced for regulatory purposes. This differs from traditional data science approaches, which require complex moving pieces and hefty software to deploy and monitor each model, often creating convoluted histories.
Furthermore, MLOps can be used in manufacturing to immediately notify operators if something is wrong within models, preventing risk-prone situations. For example, implementing MLOps at an organization means that you will be able to detect model drift, an issue that can cause productionized ML models to stray when there are distributional changes in the underlying data. Operators are then able to retrain models and keep services within SLAs before issues impact results, ensuring reliability for reproduced and versioned models as well.
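As a rough illustration of what a drift check might look like, the sketch below compares a recent sample of one input feature against its training baseline using the Population Stability Index (PSI). PSI is one common drift measure chosen here for illustration; the session does not say which measure Aimpoint Digital uses. The function names, the sample sensor values, and the 0.2 alert threshold (a widely quoted rule of thumb) are all assumptions.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the baseline range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def check_drift(baseline, recent, threshold=0.2):
    """Flag a feature for retraining when PSI exceeds the threshold."""
    score = psi(baseline, recent)
    return {"psi": score, "retrain": score > threshold}


# Hypothetical sensor readings: training baseline vs. two recent windows.
baseline = [20.0 + 0.1 * i for i in range(100)]
stable = [20.05 + 0.1 * i for i in range(100)]
shifted = [25.0 + 0.1 * i for i in range(100)]

stable_report = check_drift(baseline, stable)    # retrain is False
shifted_report = check_drift(baseline, shifted)  # retrain is True
```

A scheduled job could run a check like this per feature and notify operators, or kick off retraining, whenever the `retrain` flag is set.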
A Manufacturing Use Case From Aimpoint Digital
Aimpoint Digital's approach to adopting AI in manufacturing revolves around awareness and avoidance of common pitfalls. Here are the principal components of their approach:
- Custom solution design means that gathering and reviewing current documentation and collaborating with operators to track major pain points come first in every process. Assessing needs and developing solutions immediately becomes common practice.
- Data harmonization involves identifying outliers and utilizing suitable handling techniques to map correlation and causality, understand relevant features, and champion feature engineering.
- Solution configuration requires checking data consistently for seasonality, time dependency, and imbalance within target distributions. Key solution configuration also involves experimentation and the analysis of output quality for both current and future production.
- Deployment and automation entail schedule setting on the basis of model training and the creation of clear model deployment pipelines that utilize cutting-edge versioning and control techniques. Automated reporting, detailed model performance, drift detection, variable impact mappings, and relevant statistics around input data are also key factors in development.
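Two of the data checks named in the list above, outlier identification and imbalance within the target distribution, can be sketched in a few lines. This is a minimal example under stated assumptions: the interquartile-range (IQR) rule is just one common outlier technique, and the function names, sensor readings, and labels are hypothetical rather than taken from the session.

```python
from collections import Counter
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def minority_share(labels):
    """Return the share of the least frequent class in the target."""
    counts = Counter(labels)
    return min(counts.values()) / len(labels)


# Hypothetical line-temperature readings with one suspicious spike.
readings = [70.1, 70.4, 69.8, 70.0, 70.3, 69.9, 70.2, 95.0]
flagged = iqr_outliers(readings)

# Hypothetical defect labels: 5 defective parts out of 100.
labels = [0] * 95 + [1] * 5
share = minority_share(labels)
```

Flagged readings still need a handling decision (drop, cap, or investigate with operators), and a low minority share is a cue to consider resampling or class weighting during experimentation.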
Aimpoint Digital's Key MLOps Principles (Using Dataiku)
Automation: A clean architecture with a distinct separation between design, deployment, automation, and API nodes in Dataiku makes automation seamless. Hadoop, Spark, Impala, Kubernetes, and other systems are easily integrated into the platform, along with automated actions through scenarios that can be triggered both manually and conditionally.
Reproducibility: Built-in version control, along with data and computation engine agnosticism, allows for optimal interoperability, yielding results that are easily replicated.
Monitoring: Dataiku integrates with multiple reporters such as Slack, Microsoft Teams, and email, and supports efficient visual monitoring across multiple dashboards.
Testing: Built-in and custom metrics allow for easy checks on data and model quality and make audit trails easy to follow.
In accordance with Aimpoint Digital's approach, MLOps improves current practices by introducing four main principles to analytics processes: automation, reproducibility, monitoring, and testing. The advantages these principles bring are evident across all industries but are particularly noticeable in rapidly evolving environments such as manufacturing. Ultimately, following the path of organizations such as Aimpoint Digital and adopting a data science and AI platform like Dataiku, which bolsters MLOps, will prove beneficial for many.