Technical debt — the accumulation of messy code, aging systems, and temporary patches that you will need to fix later — is costly anywhere in an organization. Within an analytics and AI stack, however, it can be the difference between being able to generate an insight that drives millions of dollars of ROI and simply continuing with business as usual. Or, it could be the difference between deploying a machine learning (ML) model that transforms a critical business process and showing off another ML science project.
Decisions that contribute to the accumulation of technical debt in an analytics and AI stack include:
- Signing another maintenance contract on that creaky on-premises data warehouse.
- Renewing the license on that monolithic analytics software that none of the new recruits seem to be interested in using.
- Renewing the contract with a specialized consultancy that maintains the backend of critical dashboards that haven’t been migrated to the new BI solution yet.
- Buying yet another point solution that promises quick ROI but solves only one narrow problem. For example, as attention to Generative AI increases, ever more startups will develop AI-powered solutions for specific problems in the organization (e.g., AI-powered email generation for sales development representatives, AI-powered contract review for purchasing, etc.).
In each of these cases, the decision results in an analytics and AI stack that is:
- Separate from core business systems
- Not composable or modular (i.e., useful for only one purpose)
- Contributing to the siloing of teams or profiles
The accumulation of technical debt is a consequence of distributed decision-making and a desire to solve short-term problems. While any one decision that contributes to technical debt is undoubtedly defensible in isolation, eliminating the debt requires a clear decision and concerted effort. That said, paying down that debt can pay dividends and position an organization to be more nimble in the face of changing economic conditions.
Once that decision has been made to pay down technical debt, what could a solution look like? Thankfully, there is a clear direction in the industry: the modern data stack.
The Modern Data Stack for Analytics and AI
The modern data stack for analytics and AI solves many of the problems that accumulated technical debt will cause. A lot has been written about the modern data stack, including its challenges. The benefits that it provides relative to a debt-ridden analytics and AI stack are twofold:
1. A common data layer, in the form of a cloud data warehouse
2. A modern analytics and AI layer that is cross-profile, cross-functional, and use case agnostic
The Value of a Cloud Data Warehouse
The cloud data warehouse has emerged in recent years as the predominant architecture for enterprise data. Capable of handling structured and unstructured data, and providing a familiar interface in the form of popular querying and scripting languages like SQL and Python, these platforms allow for a wide range of enterprise data to be aggregated in a single environment.
This is not to say that all legacy data systems should be deprecated immediately. There are many well-run, on-premises Hadoop clusters that still provide excellent performance at very little cost. For the next generation of the data storage and compute layer, however, cloud data warehouses are the default option going forward.
The Value of a Modern Analytics and AI Layer
There are three essential attributes of a modern analytics and AI layer that will protect your organization from the future accumulation of technical debt. It is:
- Cross-profile: As companies invest in the technical upskilling of their workforces, it is important to see data skills as a spectrum. Rather than siloing their coders away from the rest of their teams, organizations should seek analytics and AI platforms that enable cross-profile collaboration across the no-code, low-code, and full-code spectrum.
- Cross-functional: As lines of business are empowered to make their own analytics and AI decisions, there is a risk of a siloing of expertise and abilities between these teams. AI solutions that solve a problem specific to one domain (employee retention prediction for HR, omnichannel optimization for Marketing, etc.) may provide fast ROI but they are not scalable and lead to an accumulation of technical debt.
- Use case agnostic: It is tempting to see a divide between the “traditional” work of descriptive analytics and the “cutting edge” of modern ML and AI solutions. But this is a false dichotomy; the reality is a spectrum of use cases. Platforms that provide the flexibility to handle both analytics and AI use cases (including Generative AI) help avoid technical debt by providing a single, modular environment for all work on data.
Dataiku Provides the Analytics and AI Layer
Dataiku combines the three attributes needed for a modern analytics and AI layer, inclusive of Generative AI. It is cross-profile, cross-functional, and use case agnostic. Large organizations like Pfizer have taken advantage of these capabilities. With Dataiku (and more specifically, the LLM Mesh), organizations can:
- Choose the right Generative AI model for a given application. For example, they can choose between a public model provided as a service and an open-source model running on their own private infrastructure.
- Connect Generative AI and ML algorithms or models to one another and to their enterprise data.
- Enable a wide range of non-coding domain experts from across the business to participate in the development and deployment of Generative AI applications.
- Maintain complete visibility and control over their AI initiatives, ensuring full AI Governance in the context of a Responsible AI framework.
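To make the first point in the list above more concrete, here is a minimal Python sketch of what choosing a model per application could look like. It is not Dataiku's LLM Mesh API. Every name in it (`ModelOption`, `choose_model`, the two hosting labels, and the stub generators) is hypothetical and exists only for illustration, and the routing rule (sensitive data goes to private infrastructure) is an assumed policy, not one stated by any vendor.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ModelOption:
    """A hypothetical description of one available Generative AI model."""
    name: str
    hosting: str  # assumed labels: "public_service" or "private_infrastructure"
    generate: Callable[[str], str]  # stand-in for a real model call


def choose_model(options: List[ModelOption], contains_sensitive_data: bool) -> ModelOption:
    """Pick a model by an assumed policy: sensitive data stays on private infrastructure."""
    wanted = "private_infrastructure" if contains_sensitive_data else "public_service"
    for option in options:
        if option.hosting == wanted:
            return option
    raise LookupError(f"no model available with hosting={wanted!r}")


# Stub generators so the sketch runs without any external service.
OPTIONS = [
    ModelOption("hosted-model", "public_service", lambda p: f"[hosted] {p}"),
    ModelOption("self-hosted-model", "private_infrastructure", lambda p: f"[private] {p}"),
]

if __name__ == "__main__":
    model = choose_model(OPTIONS, contains_sensitive_data=True)
    print(model.name, "->", model.generate("Summarize this contract."))
```

The design point is that applications call `choose_model` rather than a specific provider, so swapping or adding a model changes one list instead of every application.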
So, while technical debt is a reality for every organization, the ones that take the initiative to pay that debt down sooner rather than later are better positioned to generate value from their data and take advantage of the latest AI developments, such as Generative AI.