SaaS Rebels With a Cause

By Joy Looney

Founded in 2013 and built upon the mission of democratizing AI, Dataiku strives to bring advanced AI applications and machine learning technology to all enterprises in an efficient and reliable fashion. 

Less than a decade later, Dataiku has grown into a Series E-backed organization with a global workforce. Let’s travel back in time, though, to look at the choices Dataiku made as a young startup, highlighting the key takeaways from Dataiku CTO Clément Stenac’s “SaaS Rebels: Why ‘On-Premises’ Is Still Alive and Here to Stay” presentation at DevoxxFR 2021.

Don’t Just Follow the Pack 

From the beginning, Dataiku chose to ship on-premise software, even though SaaS remained the overwhelmingly popular choice in the industry, hence Stenac’s coined term “The SaaS Rebels.” Hold on, though! Dataiku is not a rebel without a cause. It chose the on-premise model to create the path of least resistance for both Dataiku and its customers.

Weighing Out the Options 

Startups in the AI and machine learning field face many questions at inception, and a paramount decision to make is whether to be a SaaS or an on-premise software organization. An extensive list of factors to consider exists for this decision, but let’s narrow down the factors to the following:

  • Installation and setup
  • Leveraging customers’ data stores 
  • Leveraging customers’ compute clusters
  • Security, high availability, and disaster recovery 
  • Free versions 

SaaS products are unbeatable when it comes to installation and setup: users just open their browsers, sign up, and they are done. On-premise software, on the other hand, requires users to download, install, and set up the product, which can be challenging and time-consuming.

On-premise software offers more control in leveraging the compute clusters and data stores that your customers have already set up (either in dedicated data centers or in VPCs in public clouds). With a SaaS platform, you ask your customers to upload all of their data onto your platform. Deep learning demands large amounts of data for optimal results, so hundreds to thousands of terabytes of data may need to be uploaded, and that’s costly!
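To get a feel for why uploads at this scale hurt, here is a rough back-of-the-envelope sketch. It is not from the talk: the function name `upload_days`, the link speeds, and the `efficiency` factor are all assumptions chosen for illustration, and the estimate covers transfer time only, not storage or egress fees.

```python
def upload_days(terabytes: float, gbit_per_s: float, efficiency: float = 0.8) -> float:
    """Estimate how many days it takes to move `terabytes` of data.

    All inputs are illustrative assumptions:
    - terabytes:  dataset size in decimal terabytes (1 TB = 1e12 bytes)
    - gbit_per_s: nominal link speed in gigabits per second
    - efficiency: fraction of the nominal speed actually sustained
    """
    bits_to_send = terabytes * 1e12 * 8                 # bytes -> bits
    usable_bits_per_s = gbit_per_s * 1e9 * efficiency   # sustained throughput
    seconds = bits_to_send / usable_bits_per_s
    return seconds / 86_400                             # seconds per day


if __name__ == "__main__":
    # Hypothetical dataset sizes and link speeds, for comparison only.
    for size_tb in (100, 500, 1000):
        for link in (1, 10):
            days = upload_days(size_tb, link)
            print(f"{size_tb:>5} TB over {link:>2} Gbit/s: about {days:.1f} days")
```

Under these assumptions, even a dedicated link needs days to weeks for datasets in the hundreds of terabytes, which is part of why leaving the data where it already lives is attractive.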

Additionally, offering a free version of a SaaS product incurs heavy costs, especially for a startup, because the vendor pays for the infrastructure that every free user runs on. A lightweight version of an on-premise product, by contrast, can simply be downloaded for free by your users. Let’s not forget that many large enterprises might not feel comfortable delegating the management of data security to a young startup. To avoid the anxiety of delegation, many organizations prefer the self-managed security option afforded by on-premise software.

Ok, But What About…? 

Lack of onsite IT expertise, time-consuming bug fixing, and so on: Dataiku is aware of the issues that commonly crop up with on-premise software. Take a look at the practices Dataiku employs to mitigate these potential problems:

  • Monthly releases and painless upgrades 
  • Quality first, perform hot fixes when required 
  • Produce logs and diagnostic archives 
  • Automate tests for infrastructure assessment 
  • Track user actions and send regular digests 
  • Forget about A/B testing 

Consistent maintenance through monthly releases, in addition to the biannual releases, banishes bugs before they bog down the workflow. The upgrades are designed to be time-conscious as well! Another tip from Clément: “Do not rush to ship.” Taking the time to build an architecture blueprint and automated infrastructure tests, where a single piece of test code runs on multiple platforms, eases integration and prevents issues in advance.
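The idea of running one piece of test code against many platforms can be sketched as follows. This is a minimal illustration, not Dataiku's actual test suite: the `Platform` fields, the platform names, and the version and disk thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Platform:
    """Hypothetical description of one target environment."""
    name: str
    python_version: Tuple[int, int]
    free_disk_gb: int


def check_min_requirements(platform: Platform) -> List[str]:
    """Return a list of problems; an empty list means the platform passes."""
    problems = []
    if platform.python_version < (3, 8):   # illustrative threshold
        problems.append("Python too old")
    if platform.free_disk_gb < 20:         # illustrative threshold
        problems.append("not enough free disk")
    return problems


def run_on_all(
    platforms: List[Platform],
    check: Callable[[Platform], List[str]],
) -> Dict[str, List[str]]:
    """Apply the same check to every platform and collect the findings."""
    return {p.name: check(p) for p in platforms}


# Hypothetical platform matrix; in practice these values would be probed
# from real machines rather than hard-coded.
PLATFORMS = [
    Platform("linux-a", (3, 9), 100),
    Platform("linux-b", (3, 6), 10),
]
report = run_on_all(PLATFORMS, check_min_requirements)

if __name__ == "__main__":
    for name, problems in report.items():
        print(name, "OK" if not problems else "; ".join(problems))
```

Keeping the check separate from the platform list means supporting a new environment only requires a new entry in the matrix, not a new test.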

Meeting in the Middle & More to Come 

In order to adapt to changes and trends in the ever-evolving world of AI and machine learning, Dataiku offers two alternatives in addition to the original on-premise software: 

Dataiku Online

This is “the SaaS version” of Dataiku. Dataiku Online is fully managed and runs on Dataiku servers. Dataiku Online works well for small and mid-sized companies as well as startups that don’t rely as heavily on on-premise technologies or custom clouds as larger enterprise companies do. Easy integration with other SaaS products (e.g., Snowflake, Salesforce, HubSpot) makes this alternative an attractive option for some organizations’ business models.

Dataiku Cloud Stacks 

This version of Dataiku runs in the customer’s cloud subscription (AWS, Azure, GCP), leveraging cloud storage and cloud Kubernetes. Easy setup, cluster spin-up, and a hands-off security and monitoring process make Dataiku Cloud Stacks the most desirable solution for many organizations, particularly large enterprises. Another option many customers choose is a hybrid cloud approach, which combines on-premise data centers and private cloud with public cloud services, providing greater flexibility.

What Else?

Dataiku continues to improve upon all three versions, finding the experience and version that works the best for each unique customer! 
