Securely Scaling AI With Snowflake and Dataiku

Dataiku Product, Scaling AI, Featured Joy Looney

This past June, after a long-standing partnership with the cloud-based company Snowflake, Dataiku was named data science technology partner of the year. More recently, during Dataiku’s December Product Days, JC Raveneau, Dataiku’s Sr. Director of Product Management, sat down with Michael Gregory, Field CTO for AI/ML at Snowflake, to discuss what governing and securing data looks like when scaling AI with the two platforms.

→ Watch the Full Video Here

Let’s hit the highlights from their talk! 

First, What’s the Problem? 

The value of AI is evident, and every organization wants to scale it quickly, but doing so is rarely simple without the right knowledge in your back pocket. With goals such as better customer engagement and experience, improved employee productivity, and accelerated innovation, companies ask a lot of their AI systems, yet they don’t always understand the challenges that could derail their initiatives.

Without further ado, here are some of the main problems that organizations need to be aware of when moving forward in scaling AI:

  • Inefficiency (unorganized strategies and decentralized pipelines are a recipe for disaster)
  • Opacity (a lack of clarity about the value of a quickly growing number of AI projects likely means ROI struggles and chaotic, last-minute management)
  • Risk (the risk of non-compliance with internal policies, along with exposure to outside regulations, can build tension between business/AI teams and the governance teams who want to regain control over workflows)

So, with these challenges encroaching on scaling AI efforts, what are the steps and solutions that will help organizations push back and achieve seamless growth? 

The 3 Keys to Securely Scaling AI 

AI Governance, MLOps, and Responsible AI (referred to in the talk as the three pyramids of scale) come together to form a secure strategy for scaling AI. The common focus across all three is centralization, prioritization of high-value projects, keeping risk in check, model reliability, and transparency across projects.

Going Deeper on Governance 

What if we told you that AI Governance actually has as much to do with data as it does with AI? It’s true. Arguably, AI Governance is the tip of the iceberg and cannot even happen without having a secure data governance foundation in place. Not only is data governance crucial for AI Governance, it needs to come first. Let’s take a look at industry expert Monica Rogati’s Hierarchy of AI Needs.

Building from a strong data foundation to reach top-tier AI projects, we see these steps:

  1. Collect (curation)
  2. Move/store/ingest (semantic and schema)
  3. Explore/transform (security, audit, and lineage) 
  4. Aggregate/label (pipelines and orchestration)
  5. Learn/optimize (experimentation and algorithms) 

For scalable AI, each step must rest on a reliable previous step. Data governance issues should be addressed at the stage where they arise, so they are not mislabeled later as AI Governance issues. To climb to the top tier of AI, an organization needs to make sure that two things are happening: it is not copying data, and it is auditing, auditing, auditing!

Centralized pipelines that are easy to audit, with data tethered to one place, are important for building trust and preparing for the future. Even when we believe we have a good grasp of today’s requirements, we cannot fully know what future regulations and compliance standards will bring. Having one shared, easily visible point from which you can track changes over time helps mitigate the risks that develop along the way. Snowflake’s platform was designed with exactly that in mind!
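To make the "one place, fully audited" idea concrete, here is a minimal sketch in Python. It is purely illustrative and is not how Snowflake or Dataiku implement auditing: the `AuditEvent` and `AuditLog` names, the actors, and the dataset names are all hypothetical. The point it shows is that every access is recorded against a reference to a single shared dataset rather than against a private copy, so the full history of that dataset can be read back from one place.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One immutable record of who did what to which dataset (hypothetical schema)."""
    actor: str
    action: str
    dataset: str  # a reference to the shared dataset, never a copy of its contents
    timestamp: datetime


@dataclass
class AuditLog:
    """Append-only, in-memory audit trail (illustrative only)."""
    _events: list = field(default_factory=list)

    def record(self, actor: str, action: str, dataset: str) -> AuditEvent:
        # Events are only ever appended, never edited or removed.
        event = AuditEvent(actor, action, dataset, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, dataset: str) -> list:
        # Everything that happened to one dataset, in the order it happened.
        return [e for e in self._events if e.dataset == dataset]


log = AuditLog()
log.record("analyst_a", "read", "sales.orders")
log.record("pipeline_b", "transform", "sales.orders")
log.record("analyst_a", "read", "hr.salaries")

for event in log.history("sales.orders"):
    print(event.actor, event.action, event.timestamp.isoformat())
```

In a real deployment the platform itself would capture this trail; the sketch only shows why a single shared reference makes the question "who touched this data, and when?" easy to answer.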


Snowflake & Dataiku: The Right Tools Make All the Difference

Snowflake can bring unique processes into data flows while ensuring that copies aren’t being made and, most importantly, it provides all of the features needed to build enterprise-grade data governance. The platform does not reinvent the wheel. Instead, it lets you prioritize business outcomes, getting the right data to the right people at the right time and creating efficient economies of scale.

On top of these capabilities offered by Snowflake, Dataiku provides a rich ecosystem of data governance capabilities and helps teams move into the next stage: AI Governance. Dataiku’s approach balances control and autonomy at every level by enabling people across all departments and job functions to access the data flows, facilitating consistent model management for transparency and reproducibility, and streamlining the components of complex AI projects.

Together, Snowflake and Dataiku create an agile system that ties together the loose ends of projects, makes tracking scaling over time a highly visible and accessible process, and addresses all of the common scaling AI challenges we mentioned above. It is a mutualistic relationship between platforms that generates value and mitigates risk in a reliable manner. 
