Dataiku 4.1, our brand new release, is a leap forward in our mission to bring the power of data science and machine learning at scale to people and organizations everywhere. Already, some Dataiku customers are deploying the software to hundreds of users, beyond just their core analytics teams. This new release is built to support this scale of deployment.
Heads Up!
This blog post is about an older version of Dataiku. See what's new in the latest version.
Dataiku is still the same at its essence:
- Powerful point-and-click interfaces for data preparation and analysis
- Advanced and customizable tools for cutting-edge data science
- Straightforward solutions for deploying, monitoring, and governing models in production
Dataiku 4.1 elaborates on these elements so that organizations of any size can propagate and scale their analytical capabilities.
Powerful Data Preparation and Analysis
- More visual recipes: Dataiku’s graphical interface brings powerful data preparation and machine learning functionalities to non-coders. Dataiku 4.1 goes even further by introducing additional visual recipes (i.e., drag-and-drop actions) for transformation steps such as pivoting, removing duplicates, and sorting, in just a few clicks. And since users can run these steps in-database, in Hadoop, or in-memory, they can work with even the largest datasets with ease.
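To make those transformation steps concrete, here is a minimal sketch in plain Python of what removing duplicates, sorting, and pivoting do to a small table. It is not Dataiku code and does not use any Dataiku API; the sample rows and the helper names (`remove_duplicates`, `sort_rows`, `pivot`) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: one row per (region, quarter) sale,
# including one exact duplicate row.
rows = [
    {"region": "West", "quarter": "Q2", "sales": 80},
    {"region": "East", "quarter": "Q1", "sales": 100},
    {"region": "East", "quarter": "Q2", "sales": 120},
    {"region": "West", "quarter": "Q1", "sales": 90},
    {"region": "East", "quarter": "Q1", "sales": 100},  # duplicate
]


def remove_duplicates(rows):
    """Keep the first occurrence of each distinct row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def sort_rows(rows, *columns):
    """Sort rows by the given columns, in order of priority."""
    return sorted(rows, key=lambda row: tuple(row[c] for c in columns))


def pivot(rows, index, column, value):
    """Turn the distinct values of `column` into columns, summing `value`."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[index]][row[column]] += row[value]
    return {key: dict(cells) for key, cells in table.items()}


deduped = remove_duplicates(rows)
ordered = sort_rows(deduped, "region", "quarter")
pivoted = pivot(ordered, index="region", column="quarter", value="sales")
print(pivoted)
# {'East': {'Q1': 100, 'Q2': 120}, 'West': {'Q1': 90, 'Q2': 80}}
```

A visual recipe hides this logic behind a form, and the same step can be pushed down to a database or a cluster rather than run in local memory as it is here.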
Efficient Machine Learning
- Live model competition: Watch your batch of models compete in real time, and identify obvious winners (and losers) without waiting for training to finish. Interrupt the training of models that won’t meet your standards so that you don’t waste time and resources on them (you can always change your mind later). Set a maximum training duration to make sure you deliver the best models while meeting important deadlines.
- Powerful capabilities for coders: Dataiku 4.1 brings advanced visualization libraries like R Shiny and Bokeh for creating engaging visualizations in your dashboards. Additionally, R Markdown reports let you share your results outside of Dataiku easily. Other features include support for Python 3, as well as a brand new code editor.
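The idea of a maximum training duration can be sketched in a few lines of Python. This is only an illustration of the concept, not how Dataiku implements it: `train_with_budget` and `fake_training` are hypothetical names, and each "model" is a generator that yields a made-up validation score after every training round.

```python
import time


def train_with_budget(candidates, max_seconds, clock=time.monotonic):
    """Train candidates round-robin until all finish or the budget is spent.

    `candidates` maps a model name to an iterator that yields the model's
    latest validation score after each training round. Returns the latest
    score per model and the names of models that were still training.
    """
    deadline = clock() + max_seconds
    scores = {}
    active = dict(candidates)
    # The budget is checked once per pass over the active models.
    while active and clock() < deadline:
        for name in list(active):
            try:
                scores[name] = next(active[name])
            except StopIteration:
                del active[name]  # this model has finished training
    return scores, sorted(active)


def fake_training(round_scores):
    """Stand-in for real training: yield one score per round."""
    for score in round_scores:
        yield score


candidates = {
    "random_forest": fake_training([0.71, 0.74, 0.75]),
    "logistic_regression": fake_training([0.68, 0.69]),
}
scores, unfinished = train_with_budget(candidates, max_seconds=1.0)
best = max(scores, key=scores.get)
print(best, scores[best], unfinished)
```

Because scores are recorded after every round, the leaderboard is usable at any moment, which is what makes it possible to spot winners and losers before training ends.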
Scalable Machine Learning in Production
- Reproducible environments: When an organization upgrades a code package to its latest version, there is a significant risk that older, now unmaintained projects will fall through the cracks and fail. Dataiku 4.1 addresses this risk by supporting virtual environments, which allow your team to create a safe and isolated environment, take a snapshot of the packages and languages used for each project, and reproduce them quickly and simply. There is no need to worry about upgrading a package for your latest experiments, because the code you deployed keeps running against the package versions it was built with.
- Versatile API Node: Dataiku extends its end-to-end reach by significantly strengthening its API node. In addition to scoring models built in Dataiku, Python, or R, the API node can run any function coded in Python or R. It also supports parameterized SQL queries and database lookups.
- Expanded toolkit for plugins: Plugins in Dataiku are a primary basis for collaboration, as advanced users can build customized tools so that other users can conduct specific advanced analyses easily. This new version of Dataiku greatly expands the components available to plugin creators.
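For readers unfamiliar with the parameterized SQL queries and database lookups mentioned for the API node, the sketch below shows the general technique using Python's built-in `sqlite3` module. It does not use the Dataiku API node or any Dataiku interface; the `customers` table and the functions `lookup_customer` and `customers_in_segment` are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up lookup table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, segment TEXT)"
)
conn.executemany(
    "INSERT INTO customers (id, name, segment) VALUES (?, ?, ?)",
    [(1, "Ada", "premium"), (2, "Grace", "standard"), (3, "Linus", "premium")],
)


def lookup_customer(conn, customer_id):
    """Database lookup: fetch one record by key, or None if it is absent."""
    row = conn.execute(
        "SELECT name, segment FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return None if row is None else {"name": row[0], "segment": row[1]}


def customers_in_segment(conn, segment):
    """Parameterized query: the value is bound, never pasted into the SQL."""
    cursor = conn.execute(
        "SELECT name FROM customers WHERE segment = ? ORDER BY name", (segment,)
    )
    return [name for (name,) in cursor]


print(lookup_customer(conn, 2))               # {'name': 'Grace', 'segment': 'standard'}
print(customers_in_segment(conn, "premium"))  # ['Ada', 'Linus']
```

Binding values through `?` placeholders keeps the query text fixed, so a caller can change the parameter on each request without being able to alter the SQL itself.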