We are pleased to announce that we have partnered with Microsoft to offer Dataiku as a third-party application on HDInsight, the managed Hadoop and Spark services running on top of the Azure cloud.
This integration gives users the ability to design their data science pipelines using Dataiku and run them at scale on Microsoft HDInsight without worrying about setup or deployment. Dataiku can be installed in a few clicks directly from the Azure portal on a new or existing HDInsight cluster, and comes pre-configured with Spark, H2O, and R.
Dataiku will extend the capabilities of Microsoft HDInsight by offering many features around data preparation, data analysis, data visualization, and machine learning using both visual and code-based Recipes. In turn, Microsoft HDInsight will act as an enterprise-grade, distributed backend for Dataiku by providing large-scale storage (through Azure Blob Storage) and powerful processing (through MapReduce or Spark) systems.
In addition, Dataiku will provide an easy way to complement HDInsight with other Microsoft technologies, such as SQL Server, Azure Blob Storage, or Excel, running either on Azure or on premises, to create complete Microsoft-powered data platforms.
With the Dataiku and Microsoft HDInsight integration, our customers will be able to scale their data science pipelines regardless of use case or dataset size, and focus on creating value with new data services and products instead of worrying about technical considerations.
To learn more about Dataiku and Microsoft HDInsight: