Why Meeting the ESG Demand Is a Data and Modeling Challenge

Sophie Dionnet

When embracing ESG, all financial players are confronted with three main challenges: navigating the ESG data jungle (which is getting thicker by the minute), building the right ESG models and analytics for the right processes, and navigating ESG embedding while maintaining explainability and governance. We will highlight each of these in the sections below.

1. Navigating the ESG Data Jungle 

Assessing the financial health of a company is supported by codified KPIs, based on accounting and financial disclosure rules that have been maturing for more than 50 years. ESG is still a new space, lacking clear metrics and modeling approaches for its different dimensions.

Going back to the Dataiku example, if I want to assess its environmental footprint, which indicators should I use? 

  • Dataiku’s environmental report?
  • An aggregated score delivered by an external agency? 
  • Raw data provided by other players? 
  • Public information?

And on which scope? Dataiku only? Its suppliers and customers? Finally, when I talk about the environment, should I focus on CO2 only? This is only a sample of the questions raised by one ESG sub-dimension, for one company in one specific sector. As illustrated by the European Central Bank, embracing environmental risks holistically across a whole bank raises complex questions for which definitions are starting to emerge, but industry-accepted methodologies have yet to follow:

[Figure: climate and environmental risk drivers. Source: European Central Bank, Guide on Climate-Related and Environmental Risks]

One can easily appreciate that defining metrics, with the right data blending, is a significant challenge for all financial players aiming to embrace ESG across their full scope of activity. A direct consequence is that financial players have to navigate a thick jungle of possible data sources (internal vs. external, raw vs. transformed, traditional vs. alternative, and so on) and make complex choices about how and when to use them.

What does this imply for financial institutions? The need for agility in testing, blending, and building unique ESG signals — leveraging both ESG and traditional financial KPIs, adapted to their own processes and unique convictions — with the capacity to easily operationalize and embed them into their processes.
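As a minimal sketch of what such blending might look like, the snippet below standardizes a few inputs (an external agency score, an internally derived carbon intensity, and a traditional financial KPI) and combines them using conviction weights. The company names, figures, metric names, and weights are all hypothetical and chosen purely for illustration; this is not a recommended methodology.

```python
from statistics import mean, pstdev

# Hypothetical inputs per company; all figures are made up for illustration.
companies = {
    "A": {"agency_score": 72.0, "carbon_intensity": 120.0, "roe": 0.14},
    "B": {"agency_score": 55.0, "carbon_intensity": 310.0, "roe": 0.18},
    "C": {"agency_score": 63.0, "carbon_intensity": 200.0, "roe": 0.09},
}

# Conviction weights (assumed). A negative weight means "lower is better".
weights = {"agency_score": 0.5, "carbon_intensity": -0.3, "roe": 0.2}


def zscores(values):
    """Standardize values so metrics in different units become comparable."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def blend(companies, weights):
    """Return one blended signal per company as a weighted sum of z-scores."""
    names = list(companies)
    signal = {name: 0.0 for name in names}
    for metric, weight in weights.items():
        standardized = zscores([companies[n][metric] for n in names])
        for name, z in zip(names, standardized):
            signal[name] += weight * z
    return signal


signal = blend(companies, weights)
# Companies ordered from strongest to weakest blended signal.
ranking = sorted(signal, key=signal.get, reverse=True)
```

Changing the weights is how an institution would express its own convictions here; the standardization step is what lets an agency score, an emissions figure, and a financial ratio sit in the same signal.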

In this journey, having strong data science capabilities will play an essential role, with a very broad range of applications, among which:

  • Early identification of controversy signals through news analytics: 

The harsh impact of Dieselgate on Volkswagen’s stock price in 2015, and the rush of investors to understand their exposures and divest, showed how material controversies can be.

To address controversies, there are two approaches: leveraging external data vendors such as Sustainalytics, which tend to have a reactive approach to events, and/or identifying early signals through news analytics or other screening of public information to complement these market-established data sources.

  • Leveraging satellite images and other alternative raw data sources to perform advanced identification of climate change exposure: 

Climate change is deeply affecting the extreme weather conditions to which specific geographies are exposed. Making better use of the insights alternative data can provide is a key differentiator in properly understanding current and target climate risk exposure (of individual houses, of infrastructure such as roads and public facilities, and of entire supply chains, for example). For these reasons, this type of information should be critical for insurers, banks, and private equity firms.

  • Leveraging AI to better assess and track impact of green bonds: 

Green bonds are on the rise, but as with other ESG topics, we still lack a common definition and methodology. What investors seek with green bonds is the conviction that their investments have an appropriate impact on the ground. Using alternative data and blending multiple raw data sources can act as a key differentiator to fully measure and report on the impact of green bonds over time.

  • Using alternative data to properly understand the ESG health and competitive positioning of companies vs. peers: 

Alternative data has significant potential when it comes to ESG, most notably because so much of ESG remains ill-defined. Credit and stock analysts can greatly benefit from the timely insights it provides, enabling them to go beyond raw KPIs. Be sure to check out the illustration of Kayrros’s work on flaring monitoring through satellite images, a metric which only alternative data can provide.

  • Advanced testing and simulation of the correlation between specific ESG factors and financial performance to develop ESG-embedded optimizers: 

Organizations can only achieve impact with ESG if it is embedded in core processes. This is much easier said than done, as it demands significant work to define the right set of metrics and properly understand their impact, their weighting, and so on. From an agile data science standpoint, having the capacity to both do complex backtesting and develop machine learning approaches to better predict the ESG-weighted evolution of financial markets is an essential space to invest in. 
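To make the last point concrete, here is a minimal sketch of a first-pass test of the link between an ESG factor and financial performance: a Pearson correlation between starting ESG scores and subsequent returns, plus a simple top-minus-bottom return spread. The observations are hypothetical, and a single cross-section like this is only a starting point; a real backtest would repeat the exercise over many periods and control for sector, size, and other factors.

```python
from math import sqrt

# Hypothetical cross-section: (ESG score at period start, return over period).
observations = [
    (80.0, 0.06),
    (65.0, 0.04),
    (50.0, 0.05),
    (35.0, 0.01),
    (20.0, -0.02),
    (10.0, 0.00),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def top_minus_bottom(observations, bucket_size):
    """Average return of the highest-scored names minus the lowest-scored."""
    ranked = sorted(observations, key=lambda o: o[0], reverse=True)
    top = [r for _, r in ranked[:bucket_size]]
    bottom = [r for _, r in ranked[-bucket_size:]]
    return sum(top) / len(top) - sum(bottom) / len(bottom)


scores = [s for s, _ in observations]
returns = [r for _, r in observations]
correlation = pearson(scores, returns)
spread = top_minus_bottom(observations, bucket_size=2)
```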

2. Building the Right ESG Models and Analytics for the Right Processes

Selecting ESG data sources and building ESG models can’t be done in isolation from the processes they are meant to impact. And the variety of processes which need to be ESG-embedded further complicates the challenge.

Let’s take the example of a bank. As a bank, to integrate climate change in stress testing, I need to source or develop a climate change metric capable of covering my full book of business. I need to integrate it in my starting points with appropriate line-by-line matching, add it in my stress testing scenarios (with the right capacity to understand outputs), and report on them to my regulators.
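The line-by-line matching step can be sketched as follows, under strong simplifying assumptions: the climate metric is a single loss rate per sector for one scenario, and the book is a flat list of exposures. The exposure identifiers, sectors, amounts, and rates are hypothetical. The point of the sketch is that coverage and unmatched lines are reported alongside the stressed loss, so gaps in the metric stay visible rather than being silently filled.

```python
# Hypothetical loan book: (exposure id, counterparty sector, exposure amount).
book = [
    ("L1", "utilities", 100.0),
    ("L2", "real_estate", 250.0),
    ("L3", "software", 80.0),
    ("L4", "shipping", 70.0),
]

# Hypothetical climate metric: assumed loss rate per sector under one scenario.
scenario_loss_rate = {
    "utilities": 0.08,
    "real_estate": 0.04,
    "software": 0.01,
}


def stress(book, loss_rate_by_sector):
    """Match each exposure to the metric; return (loss, coverage, unmatched)."""
    total = sum(amount for _, _, amount in book)
    loss, covered, unmatched = 0.0, 0.0, []
    for exposure_id, sector, amount in book:
        rate = loss_rate_by_sector.get(sector)
        if rate is None:
            # No metric for this sector: flag the line for review, do not guess.
            unmatched.append(exposure_id)
            continue
        covered += amount
        loss += amount * rate
    return loss, covered / total, unmatched


loss, coverage, unmatched = stress(book, scenario_loss_rate)
```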

If I want to develop options on low-carbon or socially positive indexes, I will need specific data sources that give me the tangible metrics customers require. I will need to inject these into the optimization of my indexes, in testing, in backtesting, and in production, and will have to provide the right reporting to my customers.
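As one illustrative way of injecting a carbon metric into index construction, the sketch below tilts benchmark weights away from carbon-intensive constituents and renormalizes them. This simple tilt is an assumption made for illustration, not a full optimizer (it ignores tracking error, sector constraints, and turnover), and the constituents, weights, and intensities are hypothetical.

```python
# Hypothetical benchmark: weight and carbon intensity per constituent.
benchmark = {
    "A": {"weight": 0.40, "carbon": 300.0},
    "B": {"weight": 0.35, "carbon": 100.0},
    "C": {"weight": 0.25, "carbon": 50.0},
}


def portfolio_carbon(weights, constituents):
    """Weighted-average carbon intensity of a set of weights."""
    return sum(weights[n] * constituents[n]["carbon"] for n in weights)


def tilt(constituents, strength):
    """Scale each weight by (average intensity / own intensity) ** strength.

    strength = 0 reproduces the benchmark; higher values tilt harder
    toward low-carbon names. Weights are renormalized to sum to 1.
    """
    base = {n: c["weight"] for n, c in constituents.items()}
    average = portfolio_carbon(base, constituents)
    raw = {
        n: base[n] * (average / constituents[n]["carbon"]) ** strength
        for n in base
    }
    total = sum(raw.values())
    return {n: w / total for n, w in raw.items()}


base_weights = {n: c["weight"] for n, c in benchmark.items()}
tilted_weights = tilt(benchmark, strength=1.0)
carbon_before = portfolio_carbon(base_weights, benchmark)
carbon_after = portfolio_carbon(tilted_weights, benchmark)
```

The before and after carbon figures are the kind of tangible metric that would then feed customer reporting.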

If I want to do an ESG-refined credit assessment of different manufacturing companies, I will need to leverage a variety of data sources, both structured and alternative, to ensure I fully assess the specific risks and opportunities of each business model. I also want to ensure that what I am doing has the right level of consistency, with the capacity to manage my risks in an aggregated manner and to answer the oversight needs of my regulators. Not to mention that everything I am building rests on shifting ground: regulations change and evolve, norms start to emerge (such as the one on climate pushed by TCFD), and end-customer demands shift toward deeper, more concrete impact measurement.

Long story short, when it comes to ESG, there is no one size fits all. Financial companies need to be conscious of it when building their ESG initiatives, and have approaches which allow them to both develop analytics or models specific to certain activities or processes, and foster consistency and reuse across business lines.

[Figure: where to reuse ESG data]

3. Embedding, Explainability, and Governance 

We have seen that ESG starts with data blending and continues with the development of the right models and analytics for the right processes, with the right balance between specialization and mutualization. What comes next is ensuring that ESG does not remain a “new topic” supported by a small team of experts, but becomes fully owned and embedded in all key processes.

How can I make sure that all my portfolio managers feed ESG criteria into their decision making, across all investment styles, including multi-asset total return strategies that rely heavily on synthetic instruments? Can my actuaries review all pricing models and insurance offers to incorporate the appropriate ESG dimensions in them?

Answering these questions, and all similar ones, can only be done with strong collaboration among all the stakeholders involved. Asset managers can’t expect all their professionals to become ESG experts over the course of a few days, weeks, or even months; they need to dedicate time and resources to upskilling them in order to drive ESG innovation. Keeping this topic a specialist one is a tempting route, but it will produce neither the internal buy-in nor the models that can, in the long run, deliver true ESG risk-adjusted performance.

ESG experts, core teams, data scientists, and risk teams have to work together on the development of ESG models rooted in unique ESG convictions, fitting the reality of the business processes they are to impact, with appropriate explainability for all business teams and validation from the risk team. ESG transformation fails when it is not owned by businesses.

[Figure: ESG embedding in banking]

Accelerating This Transformation With an End-to-End Collaborative Data Science Platform

If we summarize, delivering ESG is a question of:

  • Complex data blending 
  • Creating unique ESG signals
  • Building a mix of transversal models and analytics tailored to the specificities of each financial objective
  • Through collaboration of multiple types of experts
  • To successfully embed them in core financial processes, end to end, from decision making to end customer offering and reporting

An agile analytics and data science platform approach is a must-have to embrace all these dimensions and spread ESG throughout organizations. Agility in testing, the empowerment of business teams to work with data scientists in building the analytics they need, and the ability to operationalize outputs will act as significant transformation catalysts for these organizations.

In an environment marked by growing regulatory requirements for ESG, a platform approach resting on strong explainability and governance principles supports the robustness of ESG initiatives over time and their auditability by both internal and external control bodies.

Practically speaking, here’s how Dataiku can help:

  • Eased data access to multiple sources and timely blending of said sources
  • The ability to leverage NLP and other alternative data sources to create unique insights
  • Test integration of ESG in core financial processes
  • Reporting to internal and external stakeholders
  • Testing and reuse, across the entire company and its processes

The ESG revolution is on the move and the appetite among financial players to take action is there. Winners in the race to ESG will be those who not only accelerate but also manage to bring a full community of professionals along to drive this change. A collaborative platform approach (such as with Dataiku) will create the conditions these initiatives need to succeed.
