Level Up: LLMOps and MLOps With Dataiku and Snowflake

By Patrick Masi-Phelps

AI in the enterprise is much more than modifying a Jupyter notebook you found on Medium. It’s difficult! 

Machine learning (ML) engineering teams can spend months training models, implementing RAG workflows, tuning parameters, and evaluating performance. 

Once a model looks good, companies (especially those in regulated industries) face months of approvals from risk and compliance, particularly when the model uses GenAI.

If a model is net-new for the business (i.e., not just a retrain), we’re in for months of integration and deployment work: turning a .pkl file into a RESTful API service with proper authentication and auto-scaling.

Dataiku and Snowflake have been a powerful pair for analytics stacks for years, fostering secure, fast, enterprise-grade solutions. In this blog, we will take you through our latest joint integrations on LLMs and MLOps — reducing time to production from months to just days.

Dataiku LLM Mesh Connection to LLMs in Cortex AI, Including Arctic

Snowflake’s Cortex AI provides access to industry-leading LLMs from Mistral, Reka, Meta, Google, and of course Snowflake, with their new Arctic model. For organizations concerned with data leaving their Snowflake instance, Cortex ensures that prompts, queries, and responses from LLMs stay in a company’s Snowflake environment. 

At Dataiku, we believe interchangeability of LLMs is key to finding the best LLM for a use case and to future-proofing applications as new LLMs hit the market. Whataburger found this interchangeability essential when testing three different LLMs for sentiment analysis on 10,000 customer reviews per week. Snowflake Cortex offers a number of options, such as Llama3, Mistral Large, and Arctic.

In Dataiku, you can now create a connection to Cortex AI models, grant access to particular user groups, and add safety features like PII and toxicity detection by checking a box.

 

From here, we can build LLM-powered apps quickly. 

Take this example of call center transcripts. We can engineer a prompt asking Mistral Large on Cortex to summarize each call, give us the primary call topic, whether the issue was resolved, and a customer sentiment score on a scale from -1 to 1. 
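As a rough sketch of what such a prompt could look like outside the visual tooling, the snippet below builds the prompt text and validates a JSON reply. The template wording, the JSON key names, and the helper names `build_prompt` and `parse_response` are illustrative assumptions, not the prompt used in this project or part of any Dataiku or Snowflake API.

```python
import json

# Hypothetical prompt template; the wording and key names are assumptions.
PROMPT_TEMPLATE = """You are analyzing a call center transcript.
Return only a JSON object with these keys:
  "summary": a short summary of the call,
  "topic": the primary call topic,
  "resolved": true or false,
  "sentiment": a number from -1 (very negative) to 1 (very positive).

Transcript:
{transcript}
"""

REQUIRED_KEYS = {"summary", "topic", "resolved", "sentiment"}


def build_prompt(transcript: str) -> str:
    """Insert one transcript into the template."""
    return PROMPT_TEMPLATE.format(transcript=transcript.strip())


def parse_response(raw: str) -> dict:
    """Parse the model's JSON reply and check the fields we asked for."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(data["resolved"], bool):
        raise ValueError("resolved must be true or false")
    sentiment = float(data["sentiment"])
    if not -1.0 <= sentiment <= 1.0:
        raise ValueError(f"sentiment out of range: {sentiment}")
    data["sentiment"] = sentiment
    return data
```

Validating the reply before it lands in a table is a design choice: LLM output is free text, so a range check on the sentiment score catches malformed rows early.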

 

We can run the same prompt using Llama2 and Arctic, then compare the LLM outputs on a sample set of call transcripts that have been summarized and verified by the call center management team manually.

We can look at metrics like BERTScore, answer correctness, relevance, and faithfulness. It looks like Llama2 performed best for this use case!
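To make the comparison idea concrete, here is a minimal sketch that ranks models against the manually verified references. It is a simplified stand-in, not BERTScore or the evaluation built into Dataiku: it only scores the structured fields (resolved flag and sentiment), and the function names and record layout are assumptions.

```python
from statistics import mean


def score_model(predictions, references):
    """Compare one model's structured outputs with verified references."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must align and be non-empty")
    pairs = list(zip(predictions, references))
    return {
        # Share of calls where the resolved flag matches the reference.
        "resolved_accuracy": mean(
            1.0 if p["resolved"] == r["resolved"] else 0.0 for p, r in pairs
        ),
        # Mean absolute error of the sentiment score (lower is better).
        "sentiment_mae": mean(abs(p["sentiment"] - r["sentiment"]) for p, r in pairs),
    }


def rank_models(outputs_by_model, references):
    """Return (model_name, scores) pairs, best first."""
    scores = {
        name: score_model(preds, references)
        for name, preds in outputs_by_model.items()
    }
    return sorted(
        scores.items(),
        key=lambda kv: (-kv[1]["resolved_accuracy"], kv[1]["sentiment_mae"]),
    )
```

Ranking on accuracy first and sentiment error second is an arbitrary tie-breaking choice; a real evaluation would weigh text-quality metrics as well.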

 

Now we can create a scenario that runs every day at 5 a.m. and sends all transcripts, stored in a Snowflake table, through this same prompt against the Llama2 model. We can add a Slack webhook to send each call center agent a summary of the previous day’s calls.
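The notification step can be pictured roughly as follows. This sketch groups the previous day's call records by agent and posts a plain-text digest to a Slack incoming webhook, which accepts a JSON body with a `text` field. The record fields (`agent`, `topic`, `resolved`, `sentiment`, `summary`) and the function names are assumptions for illustration, not the scenario's actual implementation.

```python
import json
import urllib.request
from collections import defaultdict


def build_agent_digests(calls):
    """Group call records by agent and render one text digest per agent."""
    by_agent = defaultdict(list)
    for call in calls:
        by_agent[call["agent"]].append(call)

    digests = {}
    for agent, agent_calls in by_agent.items():
        lines = [f"Calls handled yesterday: {len(agent_calls)}"]
        for c in agent_calls:
            status = "resolved" if c["resolved"] else "unresolved"
            lines.append(
                f"- {c['topic']} ({status}, sentiment {c['sentiment']:+.1f}): {c['summary']}"
            )
        digests[agent] = "\n".join(lines)
    return digests


def post_to_slack(webhook_url, text):
    """Send one message to a Slack incoming webhook; returns the HTTP status."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In practice the webhook URL would come from a secret store rather than code, and how digests are routed to individual agents depends on the workspace setup.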

 

As with all LLM Mesh connectors, Dataiku’s LLM Cost Guard allows teams to oversee and control Snowflake Cortex AI costs (by application, service, user, or project) and diagnose issues.

With Dataiku and Snowflake, you can build LLM-powered apps, add protections around them, and productionize workflows in days rather than months.

Dataiku APIs to Snowpark Container Services

When Snowflake introduced their container platform, Snowpark Container Services (SPCS), one particular use case stood out for us at Dataiku: deploying containerized models for real-time inference. 

With Dataiku’s new API deployment option to run in SPCS, ML engineers can easily deploy visual ML models for prediction and clustering, custom Python models, and arbitrary Python functions as containerized services in SPCS.

Taking a credit card transaction fraud use case, we can train an XGBoost model to predict fraudulent transactions based on purchase characteristics, cardholder history, and merchant history. With a few clicks, we can deploy this model to SPCS as a RESTful API service.
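Once the model is exposed as a RESTful service, a client application could call it along these lines. This is a sketch under stated assumptions: the endpoint URL, the `features` payload key, the feature names, and the authorization header format are all placeholders, so check the deployed service's own documentation for the real request shape.

```python
import json
import urllib.request


def build_predict_request(endpoint_url, features, auth_header):
    """Build a POST request carrying one transaction's features as JSON."""
    # The {"features": ...} wrapper is an assumed payload shape.
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
    )


def predict(request, timeout=5.0):
    """Send the request and decode the JSON response from the service."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage; URL, token, and feature names are placeholders:
# req = build_predict_request(
#     "https://<your-service-host>/predict",
#     {"purchase_amount": 129.99, "merchant_category": "electronics"},
#     "Bearer <token>",
# )
# result = predict(req)
```

Separating request construction from sending keeps the payload easy to inspect and unit test without a live endpoint.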

 

We can monitor the model’s uptime, queries, and responses with Dataiku’s Unified Monitoring tool.

 

And we can write a scenario to retrain this model monthly, check the ROC AUC is above a certain threshold, and update the SPCS deployment automatically.
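The quality gate in such a scenario boils down to a small check. The sketch below computes ROC AUC from held-out labels and fraud scores using the pairwise (Mann-Whitney) definition, then decides whether the retrained model may replace the deployed one. The function names and the threshold value are assumptions, and a production pipeline would more likely rely on a library routine such as scikit-learn's `roc_auc_score`.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This is O(P * N), fine for a small validation set.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def should_promote(auc, threshold):
    """Allow the deployment update only when AUC is strictly above the threshold."""
    return auc > threshold


# Hypothetical usage with a placeholder threshold:
# auc = roc_auc(validation_labels, validation_scores)
# if should_promote(auc, threshold=0.90):
#     ...  # trigger the deployment update step
```

Keeping the gate as a pure function makes the promotion rule easy to audit, which matters when risk and compliance teams review the pipeline.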

 

With this new integration, enterprises can embed ML models in real-time applications by deploying pre-built inference images to SPCS in minutes rather than months. Together with the LLM Mesh connection to Cortex AI, this gives you a complete stack spanning LLMOps and MLOps.

We’re excited to see what you’ll build!
