Make Deploying and Managing AI in the Cloud Easier With Dataiku for Azure

Dataiku Product, Tech Blog | Timothy Law, Xavier Thierry

According to Gartner®, “More than 85% of organizations will embrace a cloud-first principle by 2025 and will not be able to fully execute on their digital strategies without the use of cloud-native architectures and technologies.”* In a cloud-first world, the challenge for analytics and business leaders is to give their teams the best AI, machine learning, and analytics platform while reducing the burden on IT to deploy and manage AI in the cloud. That means leveraging cloud infrastructure for elastic scale, security, and fast deployment.

That’s why Dataiku has partnered with Microsoft Azure to enable rapid deployment and easy management of the Dataiku cloud AI platform on Azure. 

Dataiku’s new cloud stacks accelerator for Azure delivers a complete set of AI capabilities for business, analytics, and data science teams, taking full advantage of Azure cloud infrastructure and managed services. It does so through a fully templated approach to deploying, maintaining, and upgrading your Dataiku AI platform on Azure.

With the new Dataiku cloud stacks accelerator, cloud architects and administrators can automate the deployment, configuration, and management of Dataiku's Everyday AI platform. The template-driven, clickable interface makes it easy for administrators to control and manage the deployment of elastic cloud AI, onboard new groups of users, and maintain, back up, restore, and upgrade Dataiku for Azure.

Enterprises can be up and running with a full AI stack in just three steps by using Azure Resource Manager with Dataiku’s cloud stacks accelerator.

[Image: custom deployment]

Step 1: Virtual Network, Permissions, and Security

The Dataiku cloud stacks accelerator deploys Dataiku instances within your Azure Virtual Network. It relies on Azure network security components (routing controls, resource access controls, and network security rules) to isolate and secure the infrastructure.

Your Azure administrator can create one managed identity for your cloud stacks accelerator instance and one for your Dataiku instance in the Azure portal. In the portal, click on “Create a resource” and follow the prompts to set up the managed identity for each instance. Dataiku also supports single sign-on with Azure Active Directory, so you can reuse your existing authentication mechanism. Single sign-on can be configured quickly once Dataiku is installed.

Now you are ready to deploy the cloud stacks accelerator inside your virtual network. You can begin by using the “Deploy to Azure” button in the Dataiku documentation or by logging into Azure Resource Manager (ARM). Cloud stacks accelerator templates are launched from within the ARM console and managed entirely within your Azure account, so you are already operating in your secure cloud environment.
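
For readers curious about what a “Deploy to Azure” button does, the sketch below builds the kind of link such a button usually wraps. It assumes the standard Azure portal custom-deployment URL prefix, which expects the template location to be fully percent-encoded. The template URL used here is a made-up placeholder, not the real location of the Dataiku template; take the real link from the Dataiku documentation.

```python
from urllib.parse import quote

# Azure portal prefix that opens the custom deployment form for a remote template.
PORTAL_PREFIX = "https://portal.azure.com/#create/Microsoft.Template/uri/"

# Hypothetical template location, for illustration only.
EXAMPLE_TEMPLATE_URL = "https://example.com/templates/fleet-manager.json"


def deploy_to_azure_link(template_url: str) -> str:
    """Return a portal link that opens the deployment form for template_url."""
    # safe="" forces ':' and '/' to be percent-encoded as well.
    return PORTAL_PREFIX + quote(template_url, safe="")


if __name__ == "__main__":
    print(deploy_to_azure_link(EXAMPLE_TEMPLATE_URL))
```

Opening the resulting link while signed in to the Azure portal shows the template's parameters as a form, which is where the network and security values described in this step are entered.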

From within the cloud stacks accelerator templates, you define your virtual network, including the label (name of the network), VNet ID, subnet name, and network security groups. By default, the Dataiku cloud stacks accelerator automatically creates network security groups when creating the virtual network. You can also use the templates to manually list the network security groups you want attached to the created instances. In this step, you assign IP addresses, manage DNS settings, and configure HTTPS, certificates, and encryption. Cloud stacks accelerator templates can be customized to the particular needs of your organization, so you can readily incorporate your networking and security requirements.
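
Before filling in the template, it can help to sanity-check the network values you plan to enter. The sketch below is one way to do that. The class and field names are hypothetical and do not correspond to actual template parameter names; the only outside fact it relies on is the standard shape of an Azure virtual network resource ID.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Standard shape of a full Azure virtual network resource ID.
VNET_ID_PATTERN = re.compile(
    r"^/subscriptions/[^/]+/resourceGroups/[^/]+"
    r"/providers/Microsoft\.Network/virtualNetworks/[^/]+$",
    re.IGNORECASE,
)


@dataclass
class VirtualNetworkSettings:
    """Hypothetical container for the values described in the text."""

    label: str
    vnet_id: str
    subnet_name: str
    # An empty list stands for "let the accelerator create the security groups".
    network_security_groups: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return readable problems; an empty list means nothing was flagged."""
        found = []
        if not self.label.strip():
            found.append("label must not be empty")
        if not VNET_ID_PATTERN.match(self.vnet_id):
            found.append("vnet_id is not a full virtual network resource ID")
        if not self.subnet_name.strip():
            found.append("subnet_name must not be empty")
        return found
```

A check like this only catches formatting slips such as pasting a network name where a full resource ID is expected. It does not verify that the network exists or that your account can access it.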

Step 2: Deploy Dataiku for Azure

Now you’re ready to deploy Dataiku’s AI and analytics platform securely within your Azure VNet. Dataiku has developed four out-of-the-box deployment templates that deploy Dataiku with everything required to start using the platform, build and develop AI and analytics projects, and productionize them in Azure.

[Image: deploy full fleet]

These deployment templates offer various architectural blueprints, from single-node design environments for building data pipelines and models to elastic environments powered by Kubernetes for small and mid-size data science teams. 

For most customers, the preferred deployment template is Deploy Full Fleet. This template deploys a complete, enterprise-ready elastic AI stack with the ability to provision, manage, and scale elastic AI compute clusters. It includes connections to managed storage services such as Azure Blob Storage and Azure Data Lake Storage, and it supports tasks such as data preparation as well as the use of compute infrastructure like Kubernetes for model training and deployment to production, all running inside your secure Azure cloud. At this point, your data scientists and analysts have everything they need to build and productionize AI and analytics projects with Dataiku for Azure.

If your teams need to analyze big data, apply deep learning, or run projects involving natural language processing (NLP) or image recognition, they can also leverage Dataiku’s native connections to additional Azure managed services. Within Dataiku, users can quickly connect to managed services, including Azure Synapse SQL Server, Azure Cosmos DB, or cognitive services for NLP or vision, such as Azure Text Analytics or Azure Translator. They can use Dataiku's native integration with additional tools like Power BI, Visual Studio, GitHub, and OneDrive, and collaborate through connections to SharePoint and Teams.

[Image: Dataiku and Azure]

Step 3: Maintain, Expand, and Upgrade Dataiku for Microsoft Azure

The Dataiku cloud stacks accelerator reduces the overhead of maintenance, onboarding, and upgrades through its fleet manager feature. IT operators or administrators can handle daily tasks and monitor Dataiku instances through the centralized fleet manager visual interface. Because the templates are easy to use, IT can either retain control or delegate platform administration to a Dataiku administrator to lighten its own load.

The Dataiku cloud stacks accelerator also makes it easier to upgrade to future versions of Dataiku. New Dataiku releases appear in the visual interface, so you can upgrade in a few clicks. Enterprises retain full control over the timing of upgrades, which are never automatic and always run under administrator control. To upgrade, select the new Dataiku version from the drop-down menu and click to reprovision the instance. The Dataiku cloud stacks accelerator then starts a new Azure virtual machine with the new Dataiku version and reattaches the data volume to it.
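
The reprovisioning idea (replace the virtual machine, keep the data volume) can be pictured with a small model. This is only a conceptual sketch: the classes, function, identifiers, and version labels are invented for illustration and are not part of any Dataiku or Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataVolume:
    """Stands in for the persistent disk holding projects and settings."""

    volume_id: str


@dataclass(frozen=True)
class Instance:
    """Stands in for one Dataiku instance: a VM plus its attached data volume."""

    vm_id: str
    dataiku_version: str
    data_volume: DataVolume


def reprovision(instance: Instance, new_version: str, new_vm_id: str) -> Instance:
    """Model an upgrade: a fresh VM on the new version, same data volume."""
    return Instance(
        vm_id=new_vm_id,
        dataiku_version=new_version,
        data_volume=instance.data_volume,  # reattached, not copied or recreated
    )


if __name__ == "__main__":
    before = Instance("vm-old", "old-version", DataVolume("disk-1"))
    after = reprovision(before, "new-version", "vm-new")
    print(after.vm_id, after.dataiku_version, after.data_volume.volume_id)
```

Because the instance's state lives on the volume rather than on the virtual machine, swapping the machine is enough to change versions while projects and settings carry over.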

[Image: example elastic fleet]

Finally, you can use the visual interface to define backup and restore procedures by setting recovery point objectives and snapshot frequency for disaster recovery. Snapshots can be manual or automated. Dataiku leverages Azure infrastructure for recovery, helping you recover and redeploy while avoiding loss of data and AI projects.
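
The relationship between snapshot frequency and recovery point objective (RPO) is simple arithmetic, sketched below. The function names are illustrative, and the model is simplified: it assumes evenly spaced automated snapshots and ignores the time a snapshot takes to complete.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval: timedelta) -> timedelta:
    """A failure just before the next snapshot loses up to one full interval."""
    return snapshot_interval


def meets_rpo(snapshot_interval: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case loss stays within the recovery point objective."""
    return worst_case_data_loss(snapshot_interval) <= rpo


if __name__ == "__main__":
    rpo = timedelta(hours=12)
    for hours in (6, 12, 24):
        interval = timedelta(hours=hours)
        print(f"snapshot every {hours}h meets a 12h RPO: {meets_rpo(interval, rpo)}")
```

In practice this means choosing a snapshot interval no longer than the amount of work your teams can afford to lose.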

This three-step, templated, menu-driven approach eliminates manual coding for most tasks, so a complete, elastic AI and analytics stack can be deployed in hours rather than days or months.

With the new Dataiku cloud stacks accelerator for Azure, analytics leaders, cloud architects, and administrators can meet the challenges of AI in a cloud-first world and continue to leverage their existing investments in Azure.

*Gartner Press Release, “Gartner Says Cloud Will Be the Centerpiece of New Digital Experiences”, 10 November 2021. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
