We’re thrilled to announce that Dataiku 7 is here! This release includes additional features for statisticians and data scientists as well as individual model prediction explanations for all. Ultimately, Dataiku 7 reinforces the notion of data-driven collaboration and empowers organizations with scalable, explainable AI.
Heads Up!
This blog post is about an older version of Dataiku. See the release notes for the latest version.
Key Features of Dataiku 7
Below is a roundup of the core features in Dataiku 7, all of which aim to help organizations root their AI efforts in collaboration and explainability.
1. A Dedicated UI for Advanced Statistical Analysis
Statisticians can now use Dataiku to perform advanced statistical analysis in the familiar worksheet-and-cards format while collaborating with the wider data or analytics team. While the feature will significantly impact statisticians, it’s not exclusive to them. Its interactive interface lets everyone, statistician or not, visualize statistics, which in turn speeds up the process of uncovering insights from a dataset and removes bottlenecks in AI project deployment.
2. Enhanced Git Collaboration
This feature enables data scientists and other code-centric users to create, delete, push, and pull Git branches directly from Dataiku. As a result, coders can duplicate a project to sandbox their changes, merge those changes back into the original project when they’re done, and have every change captured in Git. The branching and merging capabilities drive stronger collaboration among coders and make workflows more productive.
3. Individual Prediction Explanations
In an effort to make black-box models interpretable, we released two features in Dataiku 6 that help normalize explainable AI: partial dependence plots and subpopulation analysis. To round out the trifecta, this release adds a capability that helps identify which input features are responsible for a given prediction being different from the average.
With prediction explanations, organizations can debug black-box models for accuracy and bias by seeing which characteristics or features have the greatest impact on a model’s outcomes. Dataiku 7 includes both row-level prediction explanations in output datasets and interactive visualizations for exploring individual rows of data. These explanations help answer the “why” behind complex machine learning models.
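To make the idea of "which features push a prediction away from the average" concrete, here is a minimal sketch in Python. It is not Dataiku's implementation: it assumes a simple linear model, where the gap between one row's prediction and the average prediction splits exactly into one term per feature. The function names, coefficients, and sample values are all hypothetical and chosen only for illustration.

```python
def explain_linear_prediction(coefficients, feature_means, row):
    """Per-feature contribution to (row prediction - average prediction).

    Assumes a linear model: prediction = intercept + sum(coef * value).
    The intercept cancels out, so each feature contributes
    coef * (value - dataset mean of that feature).
    """
    return {
        name: coefficients[name] * (row[name] - feature_means[name])
        for name in coefficients
    }


def top_contributions(contributions, k=3):
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical model and data, for illustration only.
coefficients = {"age": 2.0, "income": 0.5}
feature_means = {"age": 40.0, "income": 100.0}
row = {"age": 45.0, "income": 90.0}

contributions = explain_linear_prediction(coefficients, feature_means, row)
for feature, value in top_contributions(contributions):
    print(f"{feature}: {value:+.1f}")
```

For real black-box models the decomposition is not this simple, which is why dedicated explanation methods exist. The output has the same shape, though: one signed contribution per feature, per row.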
4. More Elasticity With Kubernetes
Building on the managed Kubernetes cluster capability introduced in Dataiku 6, users can now run webapps on Kubernetes clusters. This supports more concurrent users and provides a fast, flexible execution backend for scaling out AI applications.
5. Active Learning Plugin
Intelligent labeling is a critical step in any successful machine learning project. With the human-in-the-loop labeling and active learning plugins from Dataiku 7, users can reduce the immense amount of time and effort associated with creating model training datasets.
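As a rough illustration of how active learning cuts labeling effort, the sketch below uses uncertainty sampling, one common strategy: ask a human to label the rows the current model is least sure about. This is an assumption-laden example, not the plugin's code. The function name, row IDs, and probabilities are hypothetical, and it assumes a binary classifier that outputs a probability for the positive class.

```python
def least_confident_rows(probabilities, batch_size):
    """Pick the rows whose predicted probability is closest to 0.5.

    probabilities: dict mapping a row id to the model's predicted
    probability of the positive class (binary classification assumed).
    Returns the row ids to send to a human labeler next.
    """
    ranked = sorted(probabilities, key=lambda row_id: abs(probabilities[row_id] - 0.5))
    return ranked[:batch_size]


# Hypothetical model scores for four unlabeled rows.
unlabeled_scores = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.40}

to_label = least_confident_rows(unlabeled_scores, batch_size=2)
print(to_label)
```

Rows the model already scores confidently (such as "a" and "c" here) are skipped, so labeling effort goes to the examples most likely to improve the next version of the model.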