Dataiku and Spark, a Powerful Combination

Since as far back as Dataiku 4.0, we've included features that help data scientists get the most out of Spark: faster computations, support for PySpark, Spark Scala, and Spark R, and easy upgrades.

Heads Up!

This blog post is about an older version of Dataiku. See the release notes for the latest version.

Check out the video below for ways that Dataiku makes the most out of Spark. Specifically, the video covers:

Dataiku's visual machine learning and how it works with Spark, along with the coding languages available (Spark Scala, PySpark, Spark R, and Spark SQL)
How to create multiple generic profiles on Spark via Dataiku, which allows more people in your organization to benefit from Spark
Spark pipelines, a new feature in Dataiku 4.0 that enables much faster calculations when running Spark workflows
How Dataiku makes it simple to upgrade to Spark 2.x while keeping all your data preparation and models just as you had them before

