Scale Your Data Science Projects With Dataiku and Microsoft HDInsight

Dataiku Product Thomas Cabrol

We are pleased to announce that we have partnered with Microsoft to offer Dataiku as a third-party application on HDInsight, the managed Hadoop and Spark services running on top of the Azure cloud.

This integration gives users the ability to design their data science pipelines using Dataiku and run them at scale on Microsoft HDInsight without worrying about setup or deployment. Dataiku can be installed in a few clicks directly from the Azure portal on a new or existing HDInsight cluster, and comes pre-configured with Spark, H2O, and R.


Dataiku will extend the capabilities of Microsoft HDInsight with features for data preparation, data analysis, data visualization, and machine learning, using both visual and code-based Recipes. In turn, Microsoft HDInsight will act as an enterprise-grade, distributed backend for Dataiku by providing large-scale storage (through Azure Blob Storage) and powerful processing (through MapReduce or Spark).

In addition, Dataiku will provide an easy way to complement HDInsight with other Microsoft technologies, such as SQL Server, Azure Blob Storage, or Excel, running either on Azure or on premises, to create complete Microsoft-powered data platforms.

With the Dataiku and Microsoft HDInsight integration, our customers will be able to scale their data science pipelines regardless of use case or dataset size, and focus on creating value with new data services and products instead of worrying about technical considerations.

To learn more about Dataiku and Microsoft HDInsight:
