AI failures make headlines across industries and use cases, producing unanticipated and sometimes harmful outcomes. If you use AI, this risk is likely already on your radar, but good intentions alone are not enough. Most organizations never anticipate or intend the negative consequences of an AI failure, and the harmful results rarely reflect their values. Yet these unintended consequences of AI occur every day regardless. So what can be done?
In a recent Dataiku Product Days session, we explored how organizations can go beyond algorithms to practice Responsible AI across the entire machine learning (ML) lifecycle and avoid the harmful impact of AI failures. This blog will highlight the key takeaways from the session.
How to Mitigate AI Failures With a Responsible AI Framework
In order to combat the problems associated with AI failure on the front end, organizations should apply a framework for Responsible AI. This kind of framework serves as a guide to building out the processes needed to practice AI responsibly.
There are three main concepts integral to a Responsible AI framework:
- Fairness: “Does the AI system distribute outcomes in a way that is consistent with our concept of fair?”
- Accountability: “Who is responsible for each decision in the building and deployment process?”
- Transparency: “What level of transparency is necessary for the business and its stakeholders? How is this delivered?”

These concepts provide a starting point for assessing the quality of an AI pipeline, and each can be linked to important questions that correspond to organizational values and principles. By working through these questions, organizations begin to view and integrate AI responsibly, both from an internal perspective and from the perspective of business users.
Beginning a Responsible AI Framework
Beyond an algorithm, the AI pipeline encompasses many steps all the way from defining a problem and designing a solution to building and monitoring that solution. To really make sense of a Responsible AI framework, we should hone in on the build portion of the pipeline — the ML lifecycle where practitioners explore and experiment with data and models.
Understanding the Building Stage of the AI Pipeline
In the build stage, there are three phases of the ML pipeline where a Responsible AI framework can be applied:

- Data processing
- Model building
- Model reporting
Responsible AI in Data Processing
Data is a reflection of the world, so it stands to reason that it carries built-in biases in various forms. Biased data can come from an imbalance caused by poor collection practices, or from human inputs that reinforce social biases. In both cases, datasets contain sensitive correlations that are not inherently obvious and must be addressed. Many datasets also harbor proxy variables: seemingly neutral features, such as a zip code, that indirectly encode sensitive attributes.
Practitioners can mitigate the impact of bias through proper identification and action. To uncover bias that has been proliferated through the ways mentioned above, practitioners can use traditional statistics to measure bias and correlation as well as conduct a thorough evaluation of the ways in which the data is collected and sourced. Asking domain experts to interrogate high-risk datasets is another important step.
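The statistical checks described above can be sketched with a few lines of pandas. This is a minimal illustration on hypothetical loan data; the column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical loan dataset; column names and values are illustrative only
df = pd.DataFrame({
    "gender":   ["F", "M", "M", "F", "M", "F", "M", "F"],
    "zip_code": ["A", "A", "B", "B", "A", "B", "A", "B"],
    "approved": [0, 1, 1, 0, 1, 0, 1, 0],
})

# Cross-tabulate a sensitive attribute against the outcome to spot imbalance:
# each row shows the share of rejections (0) and approvals (1) per group
print(pd.crosstab(df["gender"], df["approved"], normalize="index"))

# Check whether a candidate proxy variable (zip_code) tracks the sensitive
# attribute: a strong association means zip_code can leak that information
# into the model even if gender itself is dropped
print(pd.crosstab(df["zip_code"], df["gender"], normalize="index"))
```

A large gap between groups in the first table, or a strong association in the second, is a signal to bring in domain experts before modeling, as the text suggests.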
The Model Building Stage
Assuming we have accounted for the bias in the data, we can move on to the model stage, where model fairness plays a big role. Model fairness has both mathematical and socially minded components. Mathematically, a model is fair when its predictions are independent of sensitive attributes. From a socially minded view, a fair model performs consistently across different groups, so that outcomes align with the desired intentions. Why does all this matter?
To take just one example, researchers have found across a number of studies that biased data has produced biased models in which fewer people of color are flagged as needing extra care from doctors, even when their symptoms are worse than those of their white counterparts. In other words, models used by healthcare systems can reinforce existing discrimination even while performing well on standard ML metrics. Consciously applying a Responsible AI framework can make a large and important difference.
Approaching Models With a Responsible AI Outlook
Responsible AI in practice requires knowledge of the types of model fairness. There is group fairness, where the model performs equally on key metrics across groups, and individual fairness, where similar individuals receive similar predictions. Let’s also not forget counterfactual fairness, where altering the sensitive attribute alone does not change the prediction. A Responsible AI framework should measure all three.
How We Share Models Is as Important as How We Build Them
Reporting on models comes with a few common challenges. Business users and consumers are quite wary of black-box models, and traditional AI systems can be difficult to audit and trace. Lack of documentation is also a huge problem. Without proper context and understanding of model inputs, even fair AI can be misused!
How can we move past these challenges?
Transparent reporting provides model context and individual explanations for given predictions. It also gives end users the opportunity to explore different model outcomes and find actionable recourse. By documenting each stage of the data and model building pipeline, teams can more easily review and correct problems when they inevitably pop up.
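One lightweight way to practice the documentation described above is to keep a model-card-style record alongside each deployed model. The sketch below uses invented field names and figures, not a standard schema.

```python
import json

# A minimal model-card-style record; every field name and value here is
# illustrative, not a real or standard schema
model_card = {
    "model_name": "credit_risk_v2",  # hypothetical model
    "intended_use": "Pre-screening applications; not for final decisions",
    "training_data": "Internal applications 2018-2022; known group imbalance",
    "fairness_checks": {
        "demographic_parity_gap": 0.04,  # illustrative figure
        "groups_evaluated": ["gender", "age_band"],
    },
    "owner": "risk-analytics-team",  # the accountable party
}

# Persisting the card as JSON next to the model artifact makes each build
# auditable and traceable later
print(json.dumps(model_card, indent=2))
```

A record like this directly addresses the auditability and accountability gaps the section describes: every model ships with its context, its checks, and a named owner.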
The Larger Scope of a Responsible AI Framework — AI Governance
We’ve covered how Responsible AI can improve model building, but these practices are not isolated from the rest of the AI pipeline. A Responsible AI framework needs to be grounded in the larger foundation of AI Governance.
As we have established already, an AI pipeline is more than an algorithm. From problem definition all the way to model monitoring, we must balance the control and agility of AI models to scale analytics and AI safely.
In conclusion, with a carefully designed framework, organizations can infuse responsible values and principles from development all the way to deployment, avoiding the unintended, harmful consequences of AI failures and effectively scaling AI throughout their business processes.