For those who are unfamiliar, marketing attribution is the process of measuring campaign effectiveness by quantifying the influence those campaigns have on a desired outcome (e.g., starting a free trial or making a purchase). By understanding which channels or what content leads to a higher conversion rate to these desired outcomes, marketing teams can better optimize spend and messaging.
Today, marketers have more channels to advertise on than ever before, along with more data on those channels, on the customers themselves, and on their specific habits. A Salesforce study from a few years back found that, on average, it takes six to eight touches to generate just one viable sales lead.
Marketing attribution is, therefore, the perfect space for data science, which can incorporate vast amounts of data from various sources to help marketers understand in a scalable way and down to a granular level where the best (and worst) conversions are coming from. From there, marketers can adjust spend (either manually or automatically) accordingly.
This drive to find a better way to solve one of the biggest challenges facing marketers has turned what traditionally was a business question into a data science problem, fundamentally changing the core question. It used to be, “How can I get more people to buy my product using advertising?” Now, it’s “How can I quantify the influence an advertisement has on a customer’s decision to make a purchase?” or “How can we measure the effectiveness of each advertising channel?”
A Deep Dive Into Marketing Attribution
It’s worth noting that the term marketing attribution is often used to encompass three distinctly different processes:
- Attributing offline outcomes (e.g., brick-and-mortar store purchases) to a particular campaign.
- Tracking campaigns across different user devices or different media (e.g., knowing that a particular user first saw an ad on television, then visited the website from their phone, then used a tablet to actually make the final purchase).
- Measuring the relative effectiveness of different strategies as part of a digital-only campaign across one specific device (e.g., when one user on a certain device sees three different types of ads, which was the most effective?).
All three flavors of marketing attribution are quite complex (and are only becoming more so as customer experience becomes more fragmented across channels and devices), but the approach to each is different. This article focuses exclusively on the third type of marketing attribution, where data science and ML have the most direct and effective application.
Historically, marketing attribution has been a painstakingly manual process that often turns out to be more difficult (and less effective) than desired. And unfortunately, many marketing teams turn to single-source / single-touch attribution or other heuristic models because of their relative simplicity. These models are based on simple rules (like tying desired outcomes to a single source along the customer’s journey or assigning equal credit to all channels across a journey).
With rare exceptions, heuristic models for marketing attribution are a gross oversimplification and are generally inaccurate, especially for products and services with long sales cycles and many touches along the way, since more often than not a combination of messages led to the desired behavior. Heuristic models also introduce a great deal of bias; for example, last- or first-click models can place unwarranted emphasis on retargeting or Google search as effective ad targeting platforms.
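To make the rules concrete, here is a minimal Python sketch of three common heuristics (first touch, last touch, and equal or "linear" credit). The function name, the channel names, and the three journeys are all hypothetical and exist only for illustration; each journey is assumed to be an ordered list of channel touches that ended in a conversion.

```python
from collections import defaultdict

def heuristic_attribution(journeys, rule="last_touch"):
    """Give one unit of credit per converting journey, split by a simple rule."""
    credit = defaultdict(float)
    for touches in journeys:
        if not touches:
            continue
        if rule == "first_touch":
            credit[touches[0]] += 1.0
        elif rule == "last_touch":
            credit[touches[-1]] += 1.0
        elif rule == "linear":
            # Equal credit to every touch in the journey.
            share = 1.0 / len(touches)
            for channel in touches:
                credit[channel] += share
        else:
            raise ValueError(f"unknown rule: {rule}")
    return dict(credit)

# Hypothetical converting journeys (ordered channel touches).
journeys = [
    ["display", "email", "search"],
    ["social", "search"],
    ["email"],
]

last_touch = heuristic_attribution(journeys, "last_touch")
linear = heuristic_attribution(journeys, "linear")
print(last_touch)
print(linear)
```

On this toy data, the last-touch rule hands search two of the three conversions and gives display and social nothing, even though both appeared earlier in converting journeys. That is the kind of bias described above.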
OK, so if heuristic models are ineffective, what is effective? Again, this article doesn’t cover every possible approach and model; it covers the most popular and effective options that data scientists at Dataiku have tested with real-life customers:
Markov Chain Modeling
The output of a Markov model is the probability that a user will move from one step in the customer journey to another. Essentially, it models the customer journey, and from there, it allows marketers to answer the question: “If channel X were not present in my marketing strategy, what would be the effect on the probability of conversion?” This will ultimately give a “removal effect” for each channel, and through that, marketing teams can decide which channels are the most important.
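As a rough illustration of the removal effect, the sketch below builds a first-order Markov chain from a handful of journeys and recomputes the conversion probability with each channel removed. It is a simplified sketch under stated assumptions, not a production implementation: the state names `(start)`, `(conversion)`, and `(null)`, the function names, and the sample journeys are all hypothetical, and "removing" a channel is modeled by treating every visit to it as a lost journey.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"

def transition_probs(journeys):
    """Estimate first-order transition probabilities from observed journeys."""
    counts = defaultdict(lambda: defaultdict(int))
    for touches, converted in journeys:
        path = [START, *touches, CONV if converted else NULL]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }

def conversion_prob(probs, removed=None, iterations=1000):
    """Probability of reaching CONV from START, by fixed-point iteration."""
    value = {state: 0.0 for state in probs}
    value[CONV] = 1.0
    value[NULL] = 0.0
    for _ in range(iterations):
        for state, dsts in probs.items():
            if state == removed:
                continue  # a removed channel never converts, so its value stays 0
            value[state] = sum(p * value[dst] for dst, p in dsts.items())
    return value[START]

def removal_effects(journeys):
    """Normalized removal effect per channel (shares sum to 1)."""
    probs = transition_probs(journeys)
    base = conversion_prob(probs)
    if base == 0:
        raise ValueError("no conversions observed")
    channels = [s for s in probs if s != START]
    effects = {c: 1.0 - conversion_prob(probs, removed=c) / base for c in channels}
    total = sum(effects.values())
    return {c: e / total for c, e in effects.items()}

# Hypothetical journeys: (ordered touches, converted?)
journeys = [
    (["search", "email"], True),
    (["search"], False),
    (["social", "email"], True),
    (["social"], False),
]

shares = removal_effects(journeys)
print(shares)
```

In this toy data every conversion passes through email, so removing email wipes out all conversions, while removing search or social only halves them. After normalization, email receives half of the credit and the other two channels a quarter each.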
Game Theory and Shapley Value
In using game theory for marketing attribution, one can actually model the interactions that customers have with the marketing channels as a cooperative game where each marketing channel can be seen as a player in the game, and the set of all players/channels can be thought of as working together in order to drive the conversions.
So in other words, game theory in marketing attribution assigns each touchpoint credit for a conversion based on its contribution. The Shapley value stipulates that if two players (or in this case, channels) are interchangeable, they should get the same payments (in this case, credit for conversion). And if a channel adds nothing to any coalition (in marketing attribution, a combination of actions in a user journey) beyond what it generates alone, that channel should get only the conversion credit that it generates alone.
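The two properties above can be checked on a small example. The following sketch computes exact Shapley values by enumerating coalitions. It assumes one particular (and debatable) definition of a coalition's value: the number of conversions from journeys that used only channels inside that coalition. The function names and the conversion counts are hypothetical.

```python
from itertools import combinations
from math import factorial

def coalition_value(coalition, conversions):
    """Conversions from journeys whose channel set lies entirely within the coalition."""
    return sum(n for channels, n in conversions.items() if channels <= coalition)

def shapley_values(conversions):
    """Exact Shapley value per channel; cost grows exponentially with channel count."""
    players = sorted(set().union(*conversions))
    n = len(players)
    values = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = frozenset(subset)
                gain = (coalition_value(without | {player}, conversions)
                        - coalition_value(without, conversions))
                total += weight * gain
        values[player] = total
    return values

# Hypothetical conversion counts, keyed by the set of channels in the journey.
conversions = {
    frozenset({"search"}): 10,
    frozenset({"email"}): 10,
    frozenset({"search", "email"}): 20,
    frozenset({"display"}): 4,
}

credit = shapley_values(conversions)
print(credit)
```

Here search and email are interchangeable and receive equal credit, while display, which never appears alongside another channel, receives exactly the four conversions it generates alone. Exact enumeration is only practical for a small number of channels; larger channel sets typically call for sampling-based approximations.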
How to Execute a Marketing Attribution Project
If you’re familiar with the seven fundamental steps to building a data project, then you already know the basics for how to get started using ML to the benefit of your marketing team. But there are also several particularities to bear in mind when working with marketing attribution.
1. Understand the Business
As with any data science project, marketing attribution must begin on the business side. Before diving into the data, the team needs to take a step back and answer the following questions (preferably with business/marketing and data teams together):
How are we currently doing marketing attribution? Of course, before starting a new project, it’s important to understand what teams are already doing (or have already tried to do) to address the question of channel attribution. Every member of the team tackling marketing attribution should know how it’s being done right now, why it’s being done that way, how it works, the results it’s delivering, and who is using those results (as well as how they’re using them). This will provide a clearer picture of needs.
How many different types of campaigns do we have, and what is the desired action for each campaign (or campaign type)? For some businesses, or for some particular campaigns, the desired action might be making a purchase. For others, it might be more awareness-based, so a potential customer simply visiting the website would be considered the goal action. In any case, the desired action — or goal — for each marketing campaign must be defined, and it should be specific. Different attribution models might work better or worse with certain campaigns, so mapping this out clearly before getting started is critical.
What is the ideal way to deliver results that will have real business impact? In other words, what is the deliverable? Whether it’s a dashboard or real-time, automated campaign spend allocation, failing to define deliverables before kicking off a marketing attribution project sets the stage for failure (especially when data scientists and marketing teams aren’t aligned and the result is something the marketing team can’t make use of).
2. Get Your Data
Coming up with a good data science solution for a business question starts with properly scoping out the business needs, but once that’s finished, the second most essential component is good data.
The first step is to map out all channels and touchpoints along the customer journey to be sure that no channels are forgotten. From there, good data means, of course, the prerequisite tracking of all user actions on each targeted channel.
But moreover, it means understanding exactly what data is attached to each touchpoint and where the data comes from as well as what limitations (e.g., missing data) might exist. Understanding attribution data is not only fundamental to the accuracy of models, but it’s also essential for business teams and leaders to trust model outcomes.
3. Prepare Data
After identifying all the right data sources, no matter what algorithm is ultimately chosen for the attribution, the next step in all cases is to ensure the data is clean and in the right format. This requires, among other things, that the user sessions be constructed and well defined. It is at this point in the process that one may discover channels where data is missing altogether.
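What "well defined" means for a session depends on the business, but one common convention is to start a new session after a period of inactivity. The sketch below assumes raw events arrive as `(user_id, timestamp, channel)` tuples and uses a 30-minute timeout; the tuple layout, the timeout, and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby

def build_sessions(events, timeout=timedelta(minutes=30)):
    """Group (user_id, timestamp, channel) events into per-user sessions.

    A new session starts when the gap since the user's previous event
    exceeds the timeout.
    """
    sessions = []
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for user, user_events in groupby(ordered, key=lambda e: e[0]):
        current = []
        last_seen = None
        for _, ts, channel in user_events:
            if last_seen is not None and ts - last_seen > timeout:
                sessions.append((user, current))
                current = []
            current.append(channel)
            last_seen = ts
        sessions.append((user, current))
    return sessions

# Hypothetical raw events.
events = [
    ("u1", datetime(2024, 1, 1, 9, 0), "search"),
    ("u2", datetime(2024, 1, 1, 10, 0), "social"),
    ("u1", datetime(2024, 1, 1, 9, 10), "email"),
    ("u1", datetime(2024, 1, 1, 14, 0), "display"),
]

sessions = build_sessions(events)
print(sessions)
```

Whatever rule is chosen, it should be written down and agreed on with the marketing team, because the attribution results will change with the session definition.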
Should this process uncover holes in the data (like missing tags for certain channels, for example) the best approach is to stop and address the problem. It’s not possible to build an accurate attribution model with missing data, so taking the time to fix the issue to ensure data is attributed properly before moving forward is critical.
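A quick audit can surface such holes early. The sketch below compares the channels that appear in the event data against the list of channels mapped out earlier, and reports the share of events with no channel tag at all. The event layout (dictionaries with a `channel` key), the channel names, and the function name are hypothetical.

```python
def audit_channel_coverage(events, expected_channels):
    """Report channels with no data, unexpected channels, and untagged events."""
    expected = set(expected_channels)
    observed = {e["channel"] for e in events if e.get("channel")}
    untagged = sum(1 for e in events if not e.get("channel"))
    return {
        "missing_channels": sorted(expected - observed),
        "unexpected_channels": sorted(observed - expected),
        "untagged_share": untagged / len(events) if events else 0.0,
    }

# Hypothetical events; one has no channel tag, one uses an unmapped channel.
events = [
    {"channel": "search"},
    {"channel": "email"},
    {"channel": None},
    {"channel": "affiliate"},
]

report = audit_channel_coverage(events, ["search", "email", "social", "display"])
print(report)
```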
4. Explore, Clean, and Enrich Data
It is at this point that it’s necessary to define the model that will be used for the project, as all subsequent steps of working with the data depend on which model is being used for attribution.
With marketing attribution, trying multiple approaches in parallel is not practical or recommended (though if you’re currently not doing any marketing attribution, it’s a good idea to use one of the simple heuristic models first to get a baseline idea of what the start of the customer funnel looks like).
Other types of machine learning models (for example, churn, predictive maintenance, or anomaly detection) make it possible to split data into train and test sets and compare the model’s predictions to actual outcomes. Marketing attribution isn’t a true predictive model, so there are no “actual” outcomes with which to compare before making the model live; the only way to actually test a marketing attribution model is to use it.
5. Get “Predictive”
Traditionally in a data project, once data is clean and prepared, predictive models can be applied. In the case of marketing attribution, nothing is actually being predicted (hence the title of this section as get “predictive”). Instead, the outcome of the model will be a percentage or score for each channel.
6. Visualize
Of course, visualizations can be useful when it comes to marketing attribution to illustrate the distribution of the conversions for the channels themselves. This might be a bar chart showing conversions (or percentage of conversions) per channel for all time. Or it could be a line chart showing conversions per channel over time, which can be useful to see if there is fluctuation. Fluctuation could either indicate seasonality or, more likely, that the algorithm is unstable, which is a good sign that iteration is necessary.
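Alongside a chart, a simple number can help flag fluctuation. One possible measure, sketched below, is the coefficient of variation (standard deviation divided by mean) of each channel's attributed share across periods. The period labels, the shares, and the function name are hypothetical, and what counts as "too volatile" is a judgment call for the team.

```python
from statistics import mean, pstdev

def share_volatility(shares_by_period):
    """Coefficient of variation of each channel's attribution share over time."""
    channels = sorted({c for shares in shares_by_period.values() for c in shares})
    volatility = {}
    for channel in channels:
        series = [shares.get(channel, 0.0) for shares in shares_by_period.values()]
        avg = mean(series)
        volatility[channel] = pstdev(series) / avg if avg else 0.0
    return volatility

# Hypothetical attribution shares per week.
weekly_shares = {
    "week 1": {"search": 0.50, "email": 0.30, "social": 0.20},
    "week 2": {"search": 0.50, "email": 0.10, "social": 0.40},
    "week 3": {"search": 0.50, "email": 0.35, "social": 0.15},
}

volatility = share_volatility(weekly_shares)
print(volatility)
```

A channel whose share swings widely from one period to the next, with no matching change in spend or season, is a candidate for a closer look at the model.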
7. Deploy and Iterate
Deploying a marketing attribution project can mean any number of things depending on the predefined deliverables with the business and marketing teams. But at a very minimum, it means having a model working on actual data and updating regularly based on current data (again, the update frequency should have been predefined in the deliverables agreed upon with marketing; depending on their needs and the nature of the business, it could be daily, weekly, monthly, etc.).
Marketing attribution is unique as a data science project in that the only way to see its effects is to deploy the model, update marketing spend accordingly, and observe the change on the business side. In other words, after adjusting spend based on the model, look at the number of conversions: how did allocating less budget to a specific channel affect conversions overall?
By repeating this process for different channels and measuring the resulting business outcome, marketing teams will be able to identify the optimal balance.
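One way this loop might look in code is sketched below: split a budget in proportion to the model's attribution shares, then compare conversions before and after the change. This is only an illustration. The minimum share per channel (kept so that every channel continues to produce data to measure), the function names, and all figures are assumptions, not a recommended allocation policy.

```python
def reallocate_budget(total_budget, shares, min_share=0.05):
    """Split a budget by attribution share, keeping a minimum share per channel."""
    n = len(shares)
    if min_share * n > 1:
        raise ValueError("min_share is too large for this many channels")
    total_share = sum(shares.values())
    remaining = 1.0 - min_share * n
    return {
        channel: total_budget * (min_share + remaining * share / total_share)
        for channel, share in shares.items()
    }

def relative_change(before, after):
    """Relative change in a metric, e.g. conversions, between two periods."""
    return (after - before) / before

# Hypothetical attribution shares and budget.
shares = {"email": 0.5, "search": 0.25, "social": 0.25}
budget = reallocate_budget(10_000, shares)
print(budget)

# Hypothetical conversions observed before and after the reallocation.
change = relative_change(before=200, after=190)
print(change)
```

A single before-and-after comparison can be confounded by seasonality or other changes in the business, which is one reason the process is repeated across channels and over time.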
Putting It All Together
Attributing advertising channel conversions is perhaps the biggest — yet also most complex — challenge that today’s marketing teams face. And there is no magic bullet solution; though employing data science and ML techniques can significantly lower the time spent and deliver better results than traditional heuristic models, it’s still not a one-and-done deal. Marketing teams must continuously evaluate channels, and the use of those channels, at regular intervals to understand and address shifts in consumer behavior over time.