From LLM Mess to LLM Mesh: Building Scalable AI Applications

By Christina Hsiao

At Dataiku Everyday AI events in Dallas, Toronto, London, Berlin, and Dubai this past fall, we talked about an architecture paradigm for LLM-powered applications: an LLM Mesh. What actually is an LLM Mesh? How does it help organizations scale up the development and delivery of LLM-powered applications? This blog post breaks down the key highlights and takeaways from that session, answering those questions along the way. 

The Necessary Architecture for LLM-Powered Apps

From an architecture perspective, what are the components that you are going to need to build LLM-powered apps? Let’s set up a structure with four layers: the data layer, the services layer, the logic layer, and the application layer.

[Figure: The Necessary Architecture for LLM-Powered Apps]

First, you have your corporate and external data, both structured and unstructured. These are of course traditional elements, and they are what you'll need to build into your apps to tailor them to your specific business. Unstructured data, as we know, has even more of a role to play now than ever before.

But down at the data layer are the LLMs themselves. Why are we putting them here at the data layer? Well, here we're actually talking about the model itself. If you're familiar with these models, you've heard people talk about billions of parameters that control the inner workings of many, many layers of micro-calculations. But those parameters are just a bunch of numbers, right? And if you take a large new model, like the newest version of Meta's Llama family, Llama 3, we're talking about hundreds of gigabytes of data to store that model if you choose to self-host it. In theory, if you built or fine-tuned your own proprietary LLM, you would also store it in this layer.
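To give a rough sense of scale, here's a back-of-the-envelope sketch; the exact figure depends on the model size and the numeric precision you choose.

```python
# Back-of-the-envelope estimate of the storage needed to self-host an open-weight LLM.
# Illustrative numbers only; actual size depends on the model and the precision used.
params = 70e9          # e.g., a 70-billion-parameter model
bytes_per_param = 2    # 16-bit (fp16/bf16) weights take 2 bytes each

size_gb = params * bytes_per_param / 1e9
print(f"~{size_gb:.0f} GB just for the weights")   # ~140 GB, before checkpoints or fine-tuned variants
```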

So here, essentially, we're treating the LLM, that large language model, as data.

[Figure: LLMs in the data layer of the LLM Mesh]

Using these LLMs requires hosting them in a service, so we move up to the services layer. You can host models yourself for private use, in which case you own the responsibility of setting up, managing, and maintaining the quality of that service yourself, or you can use any of a myriad of off-the-shelf, hosted AI services from providers like OpenAI, Anthropic, Mistral, or one of the major cloud providers. Also at the services layer, you have retrieval services.

These use some of the same LLM technologies for information retrieval. For those of you familiar with the Retrieval-Augmented Generation (RAG) approach, these retrievers are highly relevant to that method, though the category is actually larger than just that. You then have traditional services: data-querying services like SQL databases, plus any number of other arbitrary API services.
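To make the retrieval idea concrete, here is a minimal sketch of the core of a retriever as used in RAG: embed the documents once, embed the incoming question, and return the closest matches. The embed() function is a placeholder for whatever embedding model or service you actually use.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embedding model or service here."""
    raise NotImplementedError

def build_index(docs: list[str]) -> np.ndarray:
    # Embed every document once, up front.
    return np.vstack([embed(d) for d in docs])

def retrieve(query: str, docs: list[str], index: np.ndarray, k: int = 3) -> list[str]:
    # Rank documents by cosine similarity to the query and return the top k.
    q = embed(query)
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]
```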

These last two services are leveraged as “tools” in an LLM workflow, because they will be used by LLM systems to accomplish some task. Maybe your app needs to dynamically check the weather or the traffic or the stock market in real time to make a decision. Or maybe it needs to look up details about a customer account before it knows the next best action to suggest. How you use tools for a specific use case is contained in the logic layer, because it defines what is possible and necessary for the application. 
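As an illustration, a data-querying service might be wrapped as a "tool" like this: a plain function plus a machine-readable description the LLM system can read to decide when and how to call it. The database, table, and field names here are hypothetical.

```python
import sqlite3

def lookup_customer(customer_id: str) -> dict:
    """Query the (hypothetical) CRM database and return account details."""
    conn = sqlite3.connect("crm.db")
    row = conn.execute(
        "SELECT name, plan, balance FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    conn.close()
    return {"name": row[0], "plan": row[1], "balance": row[2]} if row else {}

# The description the LLM system uses to decide when (and with what arguments) to call the tool.
LOOKUP_CUSTOMER_TOOL = {
    "name": "lookup_customer",
    "description": "Look up a customer's plan and balance before suggesting a next best action.",
    "parameters": {"customer_id": {"type": "string", "description": "Internal customer ID"}},
}
```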

Prompts are also in the logic layer. If you’re unfamiliar with prompts, they’re essentially the instructions and context you send to an LLM that define what you want it to do and how. Prompts are creating a need for not just new job functions, but a whole new type of asset management within your company. You’re going to be testing and developing new prompts, templating them for reuse, and you’ll need to govern and maintain them over time. 
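In code, a prompt template is just a reusable, parameterized piece of text, and the asset-management work described above is about versioning, testing, and governing a library of them. A minimal sketch with an illustrative template:

```python
# A versioned, reusable prompt template -- an asset to test, document, and govern like any other.
SUMMARIZE_TICKET_V2 = (
    "You are a support analyst. Summarize the ticket below in at most {max_words} words, "
    "then classify its urgency as low, medium, or high.\n\n"
    "Ticket:\n{ticket_text}"
)

def render(template: str, **variables) -> str:
    """Fill a template with concrete values before sending it to an LLM."""
    return template.format(**variables)

prompt = render(SUMMARIZE_TICKET_V2, max_words=50, ticket_text="My invoice is wrong...")
```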

Then everything comes together in agents. This is a new object in the enterprise IT landscape, but an extremely important one. It’s where the business or reasoning logic, the desired behavior, for an AI-powered application is designed. 
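In code terms, an agent is essentially a loop: the LLM reads the goal and the conversation so far, decides whether to call one of the available tools, and repeats until it can give a final answer. A highly simplified sketch, where call_llm stands in for whatever LLM service you use:

```python
def run_agent(goal: str, tools: dict, call_llm, max_steps: int = 5) -> str:
    """Minimal agent loop: the LLM picks tools until it can produce a final answer.

    `call_llm` is whatever LLM service you use; it is assumed to return either
    {"type": "final_answer", "content": ...} or
    {"type": "tool_call", "tool": ..., "arguments": {...}}.
    """
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        decision = call_llm(history, tool_specs=list(tools))
        if decision["type"] == "final_answer":
            return decision["content"]
        # The model asked for a tool: run it and feed the result back into the loop.
        result = tools[decision["tool"]](**decision["arguments"])
        history.append(f"{decision['tool']} returned: {result}")
    return "Stopped: too many steps without a final answer."
```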

Finally, the whole system is exposed to the end user in an application, which includes its user interface, reporting capabilities, and so on. Taken all together, these are the different components that you're going to need for an LLM-powered application. As you can see by the green/blue color coding, there are some common elements with traditional applications, but many new elements as well.

From Monolithic to Microservices

Throughout the years, there have been different architecture paradigms used to develop applications in the enterprise. To give an example from the traditional world of app dev: it all started with monolithic applications, where all of the application's capabilities were coded directly into the application itself. It was all there, from authentication to data querying to data transformation and display, in one big monolithic block of code.

But over time, organizations realized that this was inefficient, overly complex, and hard to adapt. We moved to a more modular service-oriented architecture (SOA). SOA was popular for a while, using XML and web services to break up the application monolith. And in more recent years, we've gone a step further with microservices. Today, the standard paradigm is to use the JSON format to pass data back and forth between microservices, usually running on Kubernetes. It has greatly simplified application development, and we're all better off as a result.

[Figure: From monolithic to microservices]

So what’s the current paradigm for building LLM-powered applications in the enterprise? 

Not surprisingly, most developers have begun by building monolithic applications on top of frameworks like LangChain (although admittedly these big code blocks do sound cooler because we call them "chains"!). And it makes sense: if you need to get started and build something quick as a proof of concept, you're likely going to use a monolithic application architecture so that the entire POC is self-contained.
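Concretely, a "monolithic" LLM app often looks something like the sketch below: the model choice, the prompt, the retrieval logic, and the output handling are all hard-coded in one block. The helper functions here are placeholders, not any particular vendor's SDK.

```python
def search_docs(query: str, k: int = 3) -> list[str]:
    raise NotImplementedError("placeholder for your vector store or retrieval service")

def call_provider(model: str, prompt: str, temperature: float) -> str:
    raise NotImplementedError("placeholder for a specific vendor's completion API")

def answer_question(question: str) -> str:
    # Retrieval, prompt, provider, and model are all baked into this one function --
    # fine for a POC, but hard to swap, govern, or reuse at scale.
    docs = search_docs(question)
    prompt = "Answer using only this context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}"
    return call_provider(model="provider-large-v2", prompt=prompt, temperature=0.2)
```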

To be sure, this is a great way to experiment with LLMs. It’s fast, you can get impressive results quickly, and it helps us think through what these new applications will be capable of. It’s a good thing! But it doesn’t scale well and it's not going to be what the most successful companies are doing a few years from now. In order to stay ahead of the competition, you are going to need to build dozens, maybe even hundreds, of applications across the spectrum of your business. 

[Figure: The LLM mess]

The LLM Mesh

Organizations need to shift the architecture paradigm. We need to think differently not just about how we build one application, but about how we approach building all LLM-powered applications in the enterprise. Taking lessons from the shift from monolithic to service-oriented architectures, the key here is abstraction and standardization.

This new architectural paradigm for LLM-powered applications in the enterprise is the LLM Mesh. 

An LLM Mesh has three main principles:

  1. First, an LLM Mesh provides an abstraction layer through which the LLMs and all of the related services can be accessed. This standardizes the interface with these different services, so that you don't need to change anything in your application if you want to change the underlying service (a minimal sketch of what such an abstraction layer might look like follows this list).
  2. Second, an LLM Mesh provides federated services for control and analysis. Due to the non-deterministic nature of LLMs, their behavior, performance, and cost are not as easy to predict as those of traditional models. All of the apps that you will be building will need these controls, so you will need to provide them in a common and shared manner.
  3. Third, good housekeeping and hygiene! An LLM Mesh provides centralized discovery and documentation for all of the components. This is important not only for the human builders and the end users, but also for the automated AI agents that are going to be picking up and using these different tools and components (of course, under the direction and within the limits designed by the developers).
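To make the first principle concrete, here is a minimal sketch of what such an abstraction layer might look like in code: applications code against one interface, and swapping the underlying service becomes a configuration change. The class and provider names are illustrative, not any particular product's API.

```python
from abc import ABC, abstractmethod

class LLMConnection(ABC):
    """The single interface that applications code against."""
    @abstractmethod
    def generate(self, prompt: str) -> str: ...

class HostedAPIConnection(LLMConnection):
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("call the hosted provider's API here")

class SelfHostedConnection(LLMConnection):
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("call your self-hosted model server here")

# Swapping the underlying service is a configuration change, not an application rewrite.
REGISTRY = {"hosted-large": HostedAPIConnection, "in-house-llama": SelfHostedConnection}

def get_llm(name: str) -> LLMConnection:
    return REGISTRY[name]()
```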

And so, taken together, this is our vision for how you go from that LLM mess to an LLM Mesh. 

  1. First, you establish your catalog and gateway, where you have controlled access to all of the different objects that you are going to be building with. You have everything laid out and available to you. Everything is standardized. Everything is documented. Everything is registered and approved for use.
  2. Then, you create your federated services for managing things like access, content, cost, performance, and relevance. Those are shared services that are going to be used by all applications (see the gateway sketch after this list).
  3. And, with that, you start building your apps in an efficient, modular, and secure way: reusing components, and having these applications feed off of one another in a way that is manageable, understandable, and safe.
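As an illustration of that second step, a shared gateway can wrap every LLM call with the same cost, latency, and audit tracking, so that each application doesn't have to reimplement those controls. A simplified sketch with illustrative names; the token count in particular is a crude stand-in for a real tokenizer.

```python
import time

class LLMGateway:
    """Shared entry point: every application's calls get the same logging and cost tracking."""

    def __init__(self, connection, cost_per_1k_tokens: float):
        self.connection = connection            # an LLMConnection from the sketch above
        self.cost_per_1k_tokens = cost_per_1k_tokens
        self.audit_log = []

    def generate(self, app_name: str, prompt: str) -> str:
        start = time.time()
        answer = self.connection.generate(prompt)
        tokens = (len(prompt) + len(answer)) // 4   # crude estimate; use a real tokenizer in practice
        self.audit_log.append({
            "app": app_name,
            "latency_s": round(time.time() - start, 3),
            "est_cost": tokens / 1000 * self.cost_per_1k_tokens,
        })
        return answer
```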

So that's the vision for an LLM Mesh. This is an architecture paradigm that you could develop yourself inside your own company, if you have people with the right know-how.

Where Does Dataiku Come In?

[Figure: The Dataiku LLM Mesh]

Dataiku offers an out-of-the-box LLM Mesh natively, as part of our platform. This is a cornerstone of our product vision, because we really believe this is how LLM-powered applications will be built in the enterprise, now and in the future. TL;DR: The LLM Mesh already exists in Dataiku today.

It’s embedded into and integrated with all the capabilities you already know and love about Dataiku, so you can seamlessly move into the new normal and enrich your existing analytics and ML pipelines with the latest and greatest GenAI technologies, without breaking stride or having to onboard new tools.
