Who Is Responsible for Data Quality Within an Enterprise?

Scaling AI Catie Grasso

Poor data quality is troublesome for a myriad of reasons, from lost analyst productivity, to bad reporting and decision making, to potential negative impacts on revenue. Data has a credibility problem and, 50 years after “garbage in, garbage out” was coined, organizations still struggle massively with data quality.

But where should CDOs and data executives begin when it comes to jumpstarting this significant undertaking of data quality management? According to McKinsey’s “The Data-Driven Enterprise of 2025” report, the CDO role will expand in the next few years to generate incremental value. The report states, “Today, CDOs and their teams function as a cost center responsible for developing and tracking compliance with policies, standards, and procedures to manage data and ensure its quality.” 

In the future, though, CDOs and their teams will “function as a business unit with profit-and-loss responsibilities” which include finding new ways to use data and developing a holistic enterprise data strategy. CDOs and data executives need to be aware of data quality issues (think unlabeled, poorly labeled, or purely inaccurate data, for example) and build out a comprehensive strategy for measuring success when it comes to data quality. The change won’t happen overnight, though. In fact, Jeff McMillan (Chief Data and Analytics Officer, Morgan Stanley Wealth Management) shared that the data quality efforts at Morgan Stanley have taken about five years to implement in a meaningful way and today make up one of the company’s competitive advantages.

Everyone Has a Stake in the Game

As we’ve outlined, data quality isn’t only important for the practitioners behind the scenes, building the machine learning (ML) models. It impacts the business (from C-suite executives to lines of business to analysts) just as much because data is a critical element in ML success at an organization-wide level and is necessary to help teams achieve their business objectives and hit their KPIs. While data quality certainly matters to any industry, it’s critical to understand and illustrate its applicability across different business domains. A few examples include:

In marketing, prospects and customers can become easily annoyed if they receive the same campaign more than once (e.g., with their name or address spelled slightly differently). This can be traced back to duplicate records within the same database or across a variety of internal and external sources.

In industries that are highly reliant on supply chain logistics (e.g., manufacturing, retail, and CPG), you may lack reliable location information to automate processes or, worse, may send products to the wrong addresses, which can lower customer satisfaction, loyalty, and advocacy.

Additionally, out-of-date customer information may result in missed opportunities for upselling and cross-selling products and services.

For banking and financial services, you might have inconsistent data (e.g., from using error-prone spreadsheets to generate financial reports), varying freshness of data, and muddled data definitions, which can cause different answers to be given to the same question.

Relatedly, data quality is often key when it comes to meeting compliance requirements such as GDPR and other privacy regulations. At the drop of a hat, organizations need to be able to locate an individual’s information, without missing any of the collected data due to inaccuracies or inconsistencies.
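
To make the marketing example above more concrete, here is a minimal Python sketch of how near-duplicate customer records (the same person with a name or address spelled slightly differently) might be flagged for review. Everything in it is an assumption for illustration: the sample records, the `normalize` and `find_likely_duplicates` helpers, and the 0.85 threshold are all hypothetical, and plain string similarity from Python's standard `difflib` module stands in for whatever matching logic a real data quality tool would use.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer records, made up for illustration only.
customers = [
    {"id": 1, "name": "Jon Smith", "address": "12 Main Street"},
    {"id": 2, "name": "John Smith", "address": "12 Main St."},
    {"id": 3, "name": "Maria Lopez", "address": "48 Oak Avenue"},
]


def normalize(record):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    text = f"{record['name']} {record['address']}"
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_likely_duplicates(records, threshold=0.85):
    """Compare every pair of records and keep the pairs at or above the threshold.

    The threshold is an assumed starting point, not a recommended value.
    """
    pairs = []
    for a, b in combinations(records, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


likely_duplicates = find_likely_duplicates(customers)
for first_id, second_id, score in likely_duplicates:
    print(f"Records {first_id} and {second_id} look like duplicates (score {score})")
```

Comparing every pair of records only works for small datasets, and a similarity score is a prompt for review rather than proof of a duplicate. Deciding which record is the correct one still depends on the business stakeholders who know the data.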

Further, IT often owns data quality, but IT teams don't have deep knowledge of the business data; only the aforementioned business stakeholders do. Therefore, as tempting as it may be for IT to wholly own data quality, centralization without a larger goal or purpose (that has the business's buy-in) won't actually generate business value and, ultimately, will result in data quality efforts falling flat.

When it comes down to it, most organizations don't have a repository of high-quality and trusted datasets. And, when they do, those datasets may not be easily accessible or available for constant reuse and are, instead, commonly siloed or fragmented. Therefore, the problem of data quality isn’t always a technological one, but an organizational one that requires synergy across all teams. Whether you are a C-suite executive, data executive, or line of business manager, you have a role to play.