In the rap world, it's common for an artist to drop a mixtape in between proper albums as a sort of appetizer for the main course. And some of these mixtapes wind up being really, really good. The new Dataiku release (4.0.5) is a minor one, but it has some great features that should whet your appetite for our upcoming 4.1 release.
Heads Up!
This blog post is about an older version of Dataiku. See the release notes for the latest version.
Dataiku 4.0.5: Between our major releases, we drop pretty awesome minor ones.
As with any minor release, there are a lot of small improvements and bug fixes (check out the full release notes). But there are some killer tracks (if you will) in here too, especially for data scientists and IT professionals.
For data scientists, Dataiku 4.0.5 brings feature selection, which lets you reduce the number of variables used in your model. The new release also integrates isolation forest, an algorithm that specializes in anomaly detection: it identifies the observations that don't conform to the patterns found in the rest of the dataset. This is useful in areas like fraud detection and predictive maintenance, and it produces some very cool charts.
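To get a feel for what isolation forest does, here is a minimal sketch in plain Python. It assumes scikit-learn's `IsolationForest` rather than anything Dataiku-specific, and the helper name `flag_anomalies`, the synthetic data, and the `contamination` value are all made up for illustration. The idea is that points far from the bulk of the data take fewer random splits to isolate, so they get lower scores.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_anomalies(X, contamination=0.02, seed=0):
    """Fit an isolation forest and return (labels, scores).

    labels: 1 for inliers, -1 for anomalies.
    scores: lower values mean "more anomalous".
    Hypothetical helper for illustration, not a Dataiku API.
    """
    model = IsolationForest(
        n_estimators=100,
        contamination=contamination,  # assumed share of anomalies in the data
        random_state=seed,            # fixed seed so runs are repeatable
    )
    labels = model.fit_predict(X)
    scores = model.score_samples(X)
    return labels, scores


# Synthetic data: 200 "normal" points around the origin plus 3 obvious outliers.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0]])
X = np.vstack([normal, outliers])

labels, scores = flag_anomalies(X)
print("flagged rows:", np.where(labels == -1)[0])
```

In a fraud or predictive maintenance setting, the rows flagged with `-1` would be the ones worth a closer look. The `contamination` setting is a judgment call, since it is your guess at how rare anomalies are.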
On the IT side, Dataiku now supports multiple Hadoop file systems, which will be interesting to those of you who augment your Hadoop clusters with cloud storage. Also, for organizations that use SAML for single sign-on, Dataiku (which, of course, is accessed entirely via a browser) now supports SAML.
We'll have more news about our main course, Dataiku 4.1, coming up shortly. But in the meantime, enjoy 4.0.5!