Dataiku Turns Untapped AI Potential Into Real-World Business Impact With NVIDIA

Use Cases & Projects, Dataiku Product, Partner | Ned Martorell

So, you're an analytics leader, enterprise architect, or someone who enjoys staying up to date with the latest trends in agentic AI. You heard about the recently announced NVIDIA Enterprise AI Factory validated design, where Dataiku is highlighted as one of the third-party agentic AI platforms, and you're wondering, “What does this mean for me and my dreams of enterprise agentification?”

If any of this rings true, read on: you've come to the right place!

This collaboration between Dataiku and NVIDIA amplifies our core mission of democratizing analytics, models, and agents within enterprises by enabling more users to harness high-performance NVIDIA infrastructure for transformative innovation. Because Dataiku is a validated component of the full-stack reference architecture, you can be assured that any agentic application developed in Dataiku works on the latest NVIDIA-Certified Systems, including NVIDIA RTX PRO Server and NVIDIA HGX B200 systems.

In this blog post, I cover:

  • Functional Architecture: A technical overview of Dataiku and NVIDIA Enterprise AI Factory validated design, with a focus on where Dataiku fits in
  • Agent Building: A concrete example of building a visual agent in Dataiku
  • What Next: A few words on the future of collaboration with NVIDIA

Technical Details: Integrating NVIDIA NIM in Dataiku and Building Agents

Let's dive into some of the details! Stephen Hawking famously wrote that "each equation included in a book would halve the sales," so in the interest of my future career as a best-selling author, I'll keep this section conceptual. Please read on. 

In this section, we cover how to: 

  1. Use Dataiku to deploy NVIDIA AI Enterprise software, specifically LLM NIM microservices, onto an OpenShift cluster running on NVIDIA DGX B200 systems
  2. Leverage the deployed models to build agentic workflows in Dataiku

Setting Up Dataiku and NVIDIA

Deploy NVIDIA NIM Using Dataiku

Let's start with some assumptions.

Running your workloads on an OpenShift cluster powered by NVIDIA B200 systems delivers top AI performance. For this setup, let’s assume your infrastructure team has provided you with ClusterAdmin privileges and there’s already a Kubernetes StorageClass in place that satisfies the NIM Operator’s requirements. This means you’re ready to hit the ground running and focus on deploying and scaling your AI applications, rather than wrangling permissions or storage configurations.
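
If you'd like to sanity-check those prerequisites before going further, here's a minimal sketch using the official kubernetes Python client. The kubeconfig path is a placeholder, and the exact StorageClass requirements depend on your NIM Operator version.

```python
# Sanity check before deploying: confirm cluster access and list StorageClasses.
# Assumes the official `kubernetes` Python client; the kubeconfig path is a placeholder.
import os
from kubernetes import client, config

config.load_kube_config(config_file=os.path.expanduser("~/.kube/openshift-b200-config"))

classes = client.StorageV1Api().list_storage_class().items
if not classes:
    raise SystemExit("No StorageClass found: the NIM Operator needs one for model caching.")

for sc in classes:
    is_default = (sc.metadata.annotations or {}).get(
        "storageclass.kubernetes.io/is-default-class", "false"
    )
    print(f"{sc.metadata.name}: provisioner={sc.provisioner}, default={is_default}")
```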

In addition, we assume that a Dataiku Design Node has been installed on a CPU (virtual) machine that is not part of the cluster.

To keep things simple, we'll expose LLM NIM microservices through a NodePort service instead of using an ingress controller and load balancer. Therefore, in addition to reaching the cluster's API server, Dataiku must be able to initiate TCP connections to the cluster nodes on the NodePort range.
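
A quick way to confirm that network path from the Dataiku node is a plain TCP check. The node hostname and NodePort below are placeholders (NodePort services default to the 30000-32767 range).

```python
# Verify the Dataiku Design Node can open a TCP connection to a cluster node
# on a NodePort. Host and port below are placeholders for your environment.
import socket

NODE_HOST = "worker-0.example.internal"   # any cluster node reachable from Dataiku
NODE_PORT = 30080                         # the NodePort assigned to the NIM service

try:
    with socket.create_connection((NODE_HOST, NODE_PORT), timeout=5):
        print(f"OK: {NODE_HOST}:{NODE_PORT} is reachable from the Dataiku node")
except OSError as exc:
    print(f"Cannot reach {NODE_HOST}:{NODE_PORT}: {exc}")
```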

Armed with these assumptions, let's deploy!

1. Install and configure the NVIDIA NIM plugin on Dataiku.

Install the NVIDIA NIM plugin from the Dataiku plugin store. This plugin is required to deploy NIM microservices to the attached OpenShift and Kubernetes clusters and then connect to the deployed NIM.

To configure the plugin, add your container registry authentication info on the plugin's settings page. The NVIDIA NIM plugin supports pulling container images and NIM model objects from both NGC and third-party artifact registries, such as JFrog.
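
If you'd rather script the installation, a hedged sketch with the dataikuapi client could look like the following. The instance URL, API key, and plugin id are placeholders, and the exact store id of the NVIDIA NIM plugin may differ on your instance.

```python
# Hypothetical sketch: install a plugin from the Dataiku plugin store through the
# public API client. The instance URL, API key, and plugin id are placeholders.
import dataikuapi

client = dataikuapi.DSSClient("https://dss.example.com:11200", "YOUR_API_KEY")

PLUGIN_ID = "nvidia-nim"  # assumed id; check the plugin store for the actual one
already_installed = {p["id"] for p in client.list_plugins()}

if PLUGIN_ID not in already_installed:
    future = client.install_plugin_from_store(PLUGIN_ID)
    future.wait_for_result()  # blocks until the installation finishes
    print(f"Installed {PLUGIN_ID}")
else:
    print(f"{PLUGIN_ID} is already installed")
```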

[Screenshot: JFrog registry settings in the NVIDIA NIM plugin]

Note that some of the plugin functionalities discussed in this article (namely, the NVIDIA NIM deployment capabilities) are currently in private preview. If you're interested in being an early adopter, please reach out to your Dataiku representative.

2. Attach the OpenShift Cluster to Dataiku.

This step is fairly simple: Retrieve a kubeconfig file from the OpenShift cluster console, and use it to configure an unmanaged cluster in Dataiku.
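
Before configuring the cluster in Dataiku, it can be worth verifying that the kubeconfig you exported actually works, for instance with the kubernetes Python client. The file path is a placeholder, and the GPU count will read 0 until the GPU Operator is installed.

```python
# Verify the kubeconfig exported from the OpenShift console before using it to
# configure the unmanaged cluster in Dataiku. The file path is a placeholder.
import os
from kubernetes import client, config

config.load_kube_config(config_file=os.path.expanduser("~/Downloads/openshift-kubeconfig"))

for node in client.CoreV1Api().list_node().items:
    gpus = node.status.capacity.get("nvidia.com/gpu", "0")  # "0" until the GPU Operator runs
    print(f"{node.metadata.name}: {gpus} GPU(s) advertised")
```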

[Screenshot: NVIDIA NIM]

3. Deploy the NIM Operator to the cluster. 

Here's where the NVIDIA Enterprise AI Factory ecosystem really starts to shine! As part of the partnership between NVIDIA and Red Hat, the NVIDIA GPU Operator and NIM Operator can be installed with one click from the OpenShift OperatorHub.

Not using OpenShift? You can use the "NVIDIA NIM" cluster action in Dataiku to deploy the GPU and NIM Operators to the Kubernetes cluster of your choice with one click.
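
Whichever route you take, a quick way to confirm the operators are in place is to check that their CustomResourceDefinitions exist. In the sketch below, the CRD names are my assumption and may vary between operator versions.

```python
# Rough check that the GPU Operator and NIM Operator CRDs are registered.
# The exact CRD names are assumptions and may differ by operator version.
from kubernetes import client, config

config.load_kube_config()
crds = {c.metadata.name for c in client.ApiextensionsV1Api().list_custom_resource_definition().items}

expected = [
    "clusterpolicies.nvidia.com",    # GPU Operator (assumed name)
    "nimservices.apps.nvidia.com",   # NIM Operator (assumed name)
    "nimcaches.apps.nvidia.com",     # NIM Operator (assumed name)
]
for name in expected:
    print(f"{name}: {'present' if name in crds else 'missing'}")
```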

[Screenshot: NVIDIA NIM Operator]

4. Deploy LLM NIM microservices to the cluster.

This step showcases one of the things Dataiku does best: We make complex and powerful technology accessible to all.

Simply put: Use the "NVIDIA NIM" cluster action to deploy any NIM (LLM or otherwise) to the attached OpenShift cluster.

[Screenshot: deploying NVIDIA NIM services to the cluster]

Once the NIM has deployed successfully (give it a couple of minutes; some of these models are quite big!), run the "NIM Services: Inspect" sub-action to retrieve the model endpoints.
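
Because LLM NIM microservices expose an OpenAI-compatible API, you can also hit the endpoint directly to confirm it is serving. The host and NodePort below are placeholders taken from the inspect output.

```python
# Quick health check against a deployed LLM NIM. The NIM exposes an
# OpenAI-compatible API, so /v1/models lists the model(s) it serves.
# Host and NodePort below are placeholders from the "NIM Services: Inspect" output.
import requests

ENDPOINT = "http://worker-0.example.internal:30080"

resp = requests.get(f"{ENDPOINT}/v1/models", timeout=10)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print("Serving model:", model["id"])
```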

[Screenshot: NIM Services: Inspect results]

5. Configure NVIDIA LLM NIM as a Dataiku LLM Mesh connection. 

As the final step before you can use NVIDIA LLM NIM in your agentic workflows, you need to connect it to the Dataiku LLM Mesh. It's really simple — just follow these instructions.

[Screenshot: custom LLM connection]

NVIDIA LLM NIM microservices are now deployed and connected to the Dataiku LLM Mesh!
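
From here, the NIM behaves like any other model in the LLM Mesh. As a quick sanity check, here is a minimal sketch of a completion call from Python code running inside Dataiku; the LLM id is a placeholder you would copy from the connection.

```python
# Minimal sketch: call the NIM-backed model through the Dataiku LLM Mesh from
# code running inside Dataiku. The LLM id is a placeholder; copy the real one
# from the LLM Mesh connection.
import dataiku

project = dataiku.api_client().get_default_project()
llm = project.get_llm("custom-llm:nvidia-nim:llama-3.1-8b-instruct")  # placeholder id

completion = llm.new_completion()
completion.with_message("Summarize what an LLM Mesh connection is in one sentence.")
response = completion.execute()

print(response.text)
```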

Benefits of Dataiku and NVIDIA NIM

Dataiku LLM Mesh enables rapid application development while ensuring security, cost control, and compliance. With both code-free and code-first tools, Dataiku empowers everyone in your organization to create and use trusted GenAI and agents.

What this means for you — whether you are an IT leader, enterprise architect, or even a developer — is that you can now unleash the power of NVIDIA LLM NIM microservices within a unified, governed, and safe environment.

As first-class citizens of the LLM Mesh, NVIDIA LLM NIM microservices:

  • Are now fully integrated into the Dataiku security and safety framework
  • Empower users to build no-code, code-first, and hybrid agentic workflows
  • Enable LLM-powered transformations (for example, through our visual LLM recipes)
  • Power our chat UI interfaces

A Concrete Example: Dataiku Assistant Visual Agent

Now that you have a primer on what makes the Dataiku and NVIDIA collaboration so promising, let's see it in action — a Dataiku Assistant agent to turbocharge your workflow.

The agent leverages the NVIDIA Llama 3.1 8B NIM for chat completion and two Dataiku-managed tools for tasks the model can't handle on its own: a vector database populated with Dataiku's technical documentation, knowledge site, and developer guides, plus a trusty Google search tool to fill in any gaps those in-house materials don't cover.

[Screenshot: new visual agent in Dataiku]

Creating a visual agent in Dataiku is as easy as this image suggests. Load the technical documents into a Dataiku project (e.g., as an S3 folder or Snowflake table), use an embedding recipe to create a knowledge bank (accelerated by the NVIDIA Mistral 7B NIM embedding model), configure the vector search and Google search agent tools, add these tools to a visual agent (with a brief description of how each tool should be used), and you have yourself an agent!
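
Incidentally, the same LLM Mesh API covers embeddings, so the NIM embedding model behind the knowledge bank can also be called directly from code. Below is a hedged sketch, with a placeholder id for the embedding model.

```python
# Hedged sketch: call the NIM embedding model through the LLM Mesh, i.e. the same
# model the embedding recipe uses to populate the knowledge bank.
# The embedding LLM id is a placeholder.
import dataiku

project = dataiku.api_client().get_default_project()
emb_llm = project.get_llm("custom-llm:nvidia-nim:nv-embedqa-mistral-7b")  # placeholder id

query = emb_llm.new_embeddings()
query.add_text("How do I configure an unmanaged Kubernetes cluster in Dataiku?")
result = query.execute()

vectors = result.get_embeddings()
print(f"Got {len(vectors)} embedding(s) of dimension {len(vectors[0])}")
```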

We can now use Agent Connect, one of the Dataiku chat UIs, to interact with the agent. Let’s ask it a few questions to evaluate the responses and even inspect which tools and resources the agent used! 
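
Agents built this way are also exposed through the LLM Mesh, so you can script a small evaluation set instead of chatting manually. In the sketch below, the agent id (and the "agent:" prefix convention) is an assumption you'd replace with the id from your own instance.

```python
# Hedged sketch: query the visual agent programmatically through the LLM Mesh,
# e.g. to run a small evaluation set instead of chatting manually.
# The agent id is a placeholder; adjust it to match your instance.
import dataiku

project = dataiku.api_client().get_default_project()
agent = project.get_llm("agent:DATAIKU_ASSISTANT")  # placeholder id

questions = [
    "How do I attach an unmanaged Kubernetes cluster to Dataiku?",
    "What does the LLM Mesh add on top of a raw model endpoint?",
]
for q in questions:
    query = agent.new_completion()
    query.with_message(q)
    resp = query.execute()
    print(f"Q: {q}\nA: {resp.text}\n")
```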

[Screenshot: Dataiku Agent Connect]

This Is Only the Beginning

Dataiku has been successfully democratizing access to analytics and AI in the enterprise for over 12 years: from the early days of enterprise Hadoop to the public cloud migration frenzy (and the subsequent cloud repatriation adjustment) to the search for the perfect data lakehouse. No matter the paradigm shift, Dataiku helps you realize the full value of your infrastructure and analytics investments.

Now, during the GenAI and agentic AI revolution, we remain committed to our core values. As part of the NVIDIA Enterprise AI Factory validated design, Dataiku will empower the entire enterprise, from the deeply technical to those with decades of business expertise, to leverage NVIDIA accelerated computing and NVIDIA AI Enterprise software to build agentic workflows.
