Dataiku 10 builds on its core strengths in model deployment and experiment tracking, bringing additional tools to ML operators maintaining live models in production. Here, we'll highlight three key features that make their job easier.
Enhanced Automation for Managing the ML Lifecycle
Putting a machine learning (ML) model into production is an important milestone, but it’s far from the end of the journey. Once a model is deployed, the challenge becomes regularly monitoring and refreshing it so that it continues to perform well as conditions and data change. Periodic spot checks are time-consuming and often difficult to reproduce later, so a more systematic approach is critical when monitoring dozens, hundreds, or even thousands of live models. To address these challenges, Dataiku 10 adds a suite of built-in MLOps features: a model evaluation store, automatic drift monitoring and analysis, and a model comparison tool.
More Context for Deployment Decisions
The model evaluation store automatically captures historical performance snapshots all in one place, arming ML operators with the context they need to make decisions. With pre-built charts to visualize metrics over time and automated drift analyses to investigate changes to data or prediction patterns, it’s easier than ever for operators to spot emerging trends and assess model health.
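Dataiku computes these drift analyses automatically, but it helps to see what such an analysis measures. Below is a minimal, generic sketch (not Dataiku's API) of one common drift metric, the population stability index (PSI), which compares a feature's distribution at training time against recent production data. The function name, the synthetic data, and the 0.2 alert threshold are all illustrative; 0.2 is merely a widely used rule of thumb.

```python
import numpy as np

def population_stability_index(baseline, current, bins=10):
    """Illustrative PSI: compares a feature's training-time distribution
    (baseline) against its recent production distribution (current)."""
    # Bin edges come from the baseline distribution
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_counts, _ = np.histogram(baseline, bins=edges)
    curr_counts, _ = np.histogram(current, bins=edges)
    # Convert counts to proportions; a small epsilon avoids log(0)
    eps = 1e-6
    base_pct = base_counts / max(base_counts.sum(), 1) + eps
    curr_pct = curr_counts / max(curr_counts.sum(), 1) + eps
    return np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct))

# Synthetic example: production values have drifted from training values
rng = np.random.default_rng(0)
baseline = rng.normal(0, 1, 10_000)      # stand-in for training-time values
current = rng.normal(0.5, 1.2, 10_000)   # stand-in for shifted production values
psi = population_stability_index(baseline, current)
print(f"PSI = {psi:.3f} -> {'drift' if psi > 0.2 else 'stable'}")
```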
During model development and routine maintenance cycles, manually tracking experiments and comparing models can be tedious and time-consuming for data teams. Visual model comparisons in Dataiku make this task a breeze, giving data scientists and ML operators side-by-side views of performance metrics, feature handling, and training information.
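The comparison view itself is visual and built into Dataiku; to give a feel for what it puts side by side, here is a hand-rolled sketch using scikit-learn. The synthetic dataset and the two candidate models are placeholders standing in for real project artifacts.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder data standing in for a real training dataset
X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two illustrative candidate models to compare
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1_000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# A side-by-side metrics table, the kind of view the visual comparison automates
print(f"{'model':<22}{'accuracy':>10}{'roc_auc':>10}")
for name, model in candidates.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    acc = accuracy_score(y_test, model.predict(X_test))
    auc = roc_auc_score(y_test, proba)
    print(f"{name:<22}{acc:>10.3f}{auc:>10.3f}")
```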
Serve and Govern External Models
In some organizations, pockets of advanced data scientists use open-source platforms such as MLflow to programmatically test and deploy models. With Dataiku 10, they can now serve, monitor, and govern models developed in MLflow within Dataiku’s MLOps framework. In addition to the benefits of centralized model deployment and oversight, bringing these models into the fold frees data scientists to do data science rather than worry about production code and ongoing model maintenance. IT operators can easily compare the performance of models built in external development environments against models built natively in Dataiku, and enjoy other capabilities of the Dataiku platform, such as the visual interface, model interpretability aids, and what-if analysis.
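The import and serving steps happen inside Dataiku; on the MLflow side, this workflow starts from a standard logged model. As a minimal sketch, logging a scikit-learn model with MLflow might look like the following, where the experiment name, run name, and training data are placeholders.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Placeholder training data and model standing in for a real project
X, y = make_classification(n_samples=2_000, n_features=15, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

mlflow.set_experiment("churn-model")  # experiment name is illustrative

with mlflow.start_run(run_name="gbt-baseline"):
    mlflow.log_param("n_estimators", model.n_estimators)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Saves the model in MLflow's standard format; an artifact like this
    # is the kind of external model that can then be served and governed
    mlflow.sklearn.log_model(model, artifact_path="model")
```

An artifact logged this way lives in MLflow's standard model format, which is what makes it portable into a central deployment and monitoring framework alongside natively built models.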