Addressing Fraud with Machine Learning: How & Why

Use Cases & Projects Lynn Heidmann

For the financial services industry (as well as many others that deal with data security and other types of non-monetary fraud), anomaly detection is hands down the most important system in operation. Yet many organizations still use more traditional modeling for fraud or anomaly detection instead of making the shift to machine learning.

stacks of euro coins

Here, we’ll look at why this is the case (that is, why there is resistance to machine learning-based fraud detection) as well as how one company - Marlette Funding, through the Best Egg Loan Platform - improved their fraud detection capacity by 10 percent by switching to a machine learning-based model.

read the full case study: ml-based fraud detection

The Resistance and Challenges

From a regulatory standpoint, of course, the financial services industry is under greater scrutiny than many others in terms of data privacy and security. As such, the industry in general is not as advanced when it comes to things like leveraging predictive analytics or using machine learning models, much less getting started on artificial intelligence (AI) initiatives.

But beyond that, machine learning (ML) can be a challenge because of the nature of the data in these industries, which is usually extremely imbalanced. That is, there are a lot more non-fraud happening than fraud in most cases (usually less than 1 percent of, for example, bank transactions are fraudulent), which makes building a ML model that actually catches fraud and doesn’t simply have high accuracy hard (here’s a good article with the details on why).

So yes, fraud detection is hard, but it’s so essential to businesses in these sectors that making improvements is critical for the coming years, especially as fraudsters get more and more sophisticated.

How Marlette Funding, Best Egg Loans Did It

Despite the inherent challenges to ML-based fraud detection, the team at Marlette Funding not only was able to implement a system that caught more fraud than their previous system using fewer resources, but they also used the project as a catalyst for other ML projects to add more efficiency and a more data-driven approach throughout the whole of the business.

best egg marlette logo

Marlette provides a suite of services through the Best Egg Platform for its bank partner, including capabilities that detect fraudsters - that is, people who have no intention of paying back loans. They previously built a statistical model in place that did this pretty well in catching fraud, but it resulted in lots of applicants being pushed to manual review that were not, in fact, fraudulent. This slowed the loan process for applicants and required resources to spend time reviewing non-fraud cases instead of higher-risk cases.

 

Evgeny Pogorelov

Evgeny Pogorelov, Director of Decision Science
Marlette Funding, Best Egg Loan

- Subject matter expert about the business and industry as well as the data engineering side
- Main goal: Enable the data scientists to do their jobs

 

 

Sami Bouguezzi

Sami Bouguezzi, Data Scientist
Marlette Funding, Best Egg Loan

- Primarily responsible for building machine learning algorithms for the company
- Main goal: Getting models he builds out of the sandbox and into production

 

The team - Evgeny and Sami - decided to implement their ML-based fraud detection model using Dataiku DSS from start to finish, as it would allow them to quickly get a model up and running in production and then move on to seamlessly build and deploy other high-priority projects (like pricing optimization or marketing mix optimization). Key steps in their process were:

  1. Working hand-in-hand with the fraud operations team to gather data and understand requirements and goals of the project from the business side.
  2. Gathering data available from a wide variety of sources (including their own website of course, but also from credit bureaus and vendors specialized in fraud detection data) to create one massive dataset.
  3. Using Dataiku DSS to manipulate the data, do feature engineering, and build the machine learning model (previously, the team was manually building models and using software only to deploy them, which was a challenge and resulted in delayed time to production).
  4. Benchmarking the proposed model against the current strategy, where they saw it had potential for real business impact and performed 10 percent better than the simple statistical model.
  5. Using the one-step API deployer released in Dataiku 5.0 to get the model into production and working on real data quickly for fast time-to-impact (again, this was a change from their previous process, which required weeks to deploy models in production and start to see results).
  6. Evaluating the impact to customers and cost savings to the business to show the value of the model and garner support for their work to continue developing models for other parts of the organization.

This combination of the right processes, people working together, and technology allowed Marlette Funding to develop a best-in-class ML-based fraud detection model for the Best Egg Platform with a relatively small team that not only made a difference quickly, but can easily be monitored and updated to prevent it from drifting or becoming out-of-sync with business goals and changes.

You can read more about how the team at Marlette Funding, Best Egg Loans transformed their data capabilities here.

You May Also Like

Taming LLM Outputs: Your Guide to Structured Text Generation

Read More

No-Code ML and GenAI With Dataiku and Fabric

Read More

The Objects of an LLM Mesh for Building LLM-Powered Applications

Read More

Data Lineage: The Key to Impact and Root Cause Analysis

Read More