Maximizing Enterprise Data Products Distribution

Dataiku Product, Featured Marie Merveilleux du Vignaux

What’s the difference between traditional data outputs and data products? Jean-Guillaume Appert, senior director of product management at Dataiku, and Marko Stojsavljevic, business transformation expert at Dataiku, answered this question (and more!) in a recent Dataiku Product Days session. 

How? The experts tackled this question by likening traditional data outputs to purchasing raw materials and data products to carefully packaged, ready-made products curated to address a specific need. Let’s dive into this analogy to understand the difference. 

What’s the Difference? 

Traditionally, businesses have worked with data outputs and tried to base business decisions on them. The main difference between data outputs and data products lies in specificity and packaging. Marko explained:

The key difference is the difference between traditional data outputs or serving the business with some data … versus a data product that has a specific output in mind that's specifically packaged and curated in order to serve a particular need.

How to Implement a Data Product? 

This image demonstrates why data products are preferable to data outputs. Implementing data products, however, isn't without hurdles. Trust in the data, ownership, stewardship, and marketing each pose unique challenges. When a small team of data and AI builders serves the varied data and AI needs of distinct business domains, alignment, reactivity, and scalability become substantial obstacles. Jean-Guillaume acknowledged these challenges, stating, "There is definitely an alignment problem or reusability aspect that we have to do … there is a scalability aspect."

To overcome these challenges, organizations often turn to complex workflow systems. However, these may not effectively address reactivity or keep pace with growing needs. To distribute data products company-wide, Jean-Guillaume and Marko suggested a more federated or decentralized model. The idea is for a central team to produce the majority of data products while giving peripheral teams and individuals the autonomy to craft their own according to their data needs. This structure strikes the necessary balance between centralization and decentralization.

[Image: black and white cartoon]

The Role of Dataiku

Dataiku, the Universal AI Platform, offers various capabilities to accompany organizations in these efforts. Some features outlined in the session include AI data preparation, data quality checks, workspaces for collaboration, a data catalog for reuse, and robust governance features. 

These capabilities, and more, bring considerable value regardless of the distribution model. The session outlined three approaches to scaling and showed how Dataiku can support each:

  1. Centralized: The centralized approach leverages a data catalog to help curate and consume datasets. Dataiku ensures a quick and efficient data product development process for centralized teams.
  2. Federated: The federated approach, on the other hand, creates workspaces for consumers to use dashboards or insights. Dataiku encourages collaboration in a federated model.
  3. Decentralized: The decentralized approach enables sharing best practices and data across different business lines. Dataiku provides visual recipes and AI-driven data preparation for decentralized users.

The platform offers an array of capabilities to support different operating models, facilitating efficient data product development, monitoring, and governance. Jean-Guillaume notably highlighted the importance of data governance in enforcing rules and compliance. Such features give visibility into projects outside of one's own ecosystem, thereby promoting collaboration and value creation.

The Road Ahead

The session ended with an introduction to brand-new Dataiku features such as lineage and data quality improvements, including the ability to track metrics over time to spot trends and drift. Learn more about data lineage in this blog post, and incorporate these features into your workflows to put data products to work in transforming your enterprise capabilities.
