Let’s Talk Attribution: 3 Key Flaws

This is part two of a three-part blog post series, transcribing talks I gave at multiple conferences in 2017/18. Please read part one of the series to get the origin of the talks, the business problem they focus on and a review of different attribution models using a football (soccer) analogy.

Then read on here for my description of the three key flaws that impact the very concept of attribution models, meaning you should look for alternative approaches to optimise your marketing campaigns.

Flaws in the Concept of Attribution Models (slides 27 – 43)

As I said at the start of the first post, I disagree with attribution and have done so for years. To me, there are three key flaws with the very concept of attribution modelling:

  1. Customer journey mapping doesn’t include all touchpoints
  2. Attribution models are based on allocating 100% of revenue/conversions to all (known) touchpoints
  3. Attribution models are built off historical data

There are definitely more flaws than this but I had limited time in my talks. The rest of this blog post goes into each of these flaws in more detail.

1. Missing Touchpoints (slides 29 – 36)

Screenshots from attribution tools detailing the need for every customer touchpoint

Looking at the descriptions of data-driven attribution tools online, you will get similar stories. They may use different algorithms, each trying to give life to its own version of Skynet, but the basic principles are the same. One of these basic principles is that the tool needs every touchpoint in the customer’s journey to be recorded so that it can statistically calculate the impact of each touchpoint and thus assign value to each.

It is just not possible to capture every customer touchpoint.

Multiple Device Use

Let us return to our football analogy. In each scenario, we had three players involved in the passage of play that led to the goal. The midfielder ran with the ball and passed to the winger, who crossed to the striker to score. But what if you had only been watching half the pitch, and had therefore seen only half of this passage of play?

What actually happened was that the midfielder had received the ball from a defender. That defender got the ball via a throw-in from another defender. Therefore five players had touched the ball, not just three. In assigning credit for the goal, all five players need to be considered, not just the three you could see because you were only watching the front half of the pitch. Otherwise you would invest only in players who are great at turning possession into goals, and then lose every game because there are no players to win the ball in the first place (or to defend).

Football analogy extended to both sides of the pitch

In the same way, unless visitors are logged in during every visit, you can only see their behaviour on a single device. If they used a home computer to make the purchase but a work computer (or smartphone or any other device) to research the product, these vital research touchpoints would not be recorded as part of that complete customer journey.

Not investing correctly in these top-of-funnel marketing channels could mean fewer people becoming aware of your products, leading to fewer conversions for the bottom-of-funnel campaigns.

Offline Touchpoints

Reviewing that passage of play in more detail, while five players touched the ball before the goal was scored, there were actually other players involved as well. One player forced the mistake from the opposition that led to the ball going out and the subsequent throw-in. Without that defensive effort, there would have been no goal. Other players were running in support for the attack. They didn’t get the ball but they drew away the opposition defenders to allow the winger and striker the space they needed to make the cross and score the goal.

Football analogy including players not involved in passage of play

In the same way, online experiences may only be part of the customer decision process. Capturing only the touchpoints that use an online device means understanding only part of the story. There could be offline advertising on TV or in the press that got the future customer thinking. They might have had a chat with their neighbour over the back fence and received a recommendation from them. Those touchpoints, those influences, are critical in the purchase decision and attribution tools do not capture them.

So back to these attribution tools and their self-defined need to capture, map and take into account every touchpoint across the entire customer journey. As it is not possible to capture every customer touchpoint, this invalidates the output of these data driven attribution tools based on their own logic. It is like attempting to calculate the contribution of players to goals scored, without knowing that you were only seeing half the team.

Simple example

A (very) simplistic example to illustrate this:

  • An online retailer has a high proportion of visitors researching their potential purchase
    • 80% of customers make at least one research visit prior to their purchase
    • 20% purchase on their first visit
  • 75% of these researchers (60% of all customers) do so at work (without logging in) before purchasing at home on a different device, so across all customers:
    • 60% research on work device, purchase on home device
    • 20% research on home device, purchase on home device
    • 20% purchase on first visit (home or work device)
  • Given no login and due to different devices, tools will all say:
    • 80% purchase on first visit
    • 20% purchase on subsequent visit after research
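
The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration: the segment layout and the helper names are my own hypothetical labels, and the key assumption is that a tool without login data can link research to a purchase only when both happen on the same device.

```python
# Illustrative sketch of the example above; all names are hypothetical.
# Each segment: (share of customers in %, research device, purchase device).
# A research device of None means the customer bought on their first visit.
segments = [
    (60, "work", "home"),  # research at work, purchase at home
    (20, "home", "home"),  # research and purchase at home
    (20, None, "any"),     # purchase on first visit (home or work)
]

FIRST_VISIT = "purchase on first visit"
AFTER_RESEARCH = "purchase after research"


def true_journey(research_device):
    """What actually happened, regardless of device."""
    return AFTER_RESEARCH if research_device is not None else FIRST_VISIT


def observed_journey(research_device, purchase_device):
    """What a device-based tool records when nobody logs in (assumption):
    research is linked to the purchase only if it was on the same device."""
    if research_device is not None and research_device == purchase_device:
        return AFTER_RESEARCH
    return FIRST_VISIT


true_view = {FIRST_VISIT: 0, AFTER_RESEARCH: 0}
tool_view = {FIRST_VISIT: 0, AFTER_RESEARCH: 0}

for share, research_device, purchase_device in segments:
    true_view[true_journey(research_device)] += share
    tool_view[observed_journey(research_device, purchase_device)] += share

print("Truth:", true_view)
print("Tool: ", tool_view)
```

The two dictionaries are mirror images of each other: the cross-device researchers are silently moved into the "first visit" bucket.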

The business strategy that will be implemented based on this data, which shows 80% of customers purchasing on their first visit, will be very different to the strategy that would be implemented if the truth were known: that 80% of customers research before purchasing.

Not having the true complete customer journey breaks attribution models and can lead to the wrong strategies being implemented.

Side point – But all Digital Analytics data is wrong, this is just more of the same!

Yes, we know that all Digital Analytics data is inaccurate. It doesn’t track some users, some hits aren’t fired, the implementation is wrong. But that is all ok as the sample captured reflects the population behaviour. Therefore, the actions taken based on this data, the strategies you devise, are still correct.

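That is not necessarily true of the output from attribution models/tools. The missing touchpoints are not a random sample: entire devices and offline channels are absent, so the resulting picture can be systematically skewed.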

2. The True Impact of Marketing Campaigns (slides 37 – 40)

An assumption behind all attribution modelling is that all sales/conversions must be due to marketing activity. All revenue received by a company needs to be allocated to the marketing channels or campaigns that contributed to the generation of that revenue. This is another flawed assumption, easily dispelled through a couple of simple examples, provided by Gary Angel in a blog post back in 2014 entitled “Don’t give me no stinking credit: re-thinking digital attribution”.

The first example deals with a motors dealership. They do research into their customers and discover a website that 20% of customers view prior to making a purchase with them. Based on this research, they place ads on the website. The data shows great results, with 20% of sales occurring after viewing one of these ads.

The second example is for a company that offers annual subscriptions, with very high retention rates: typically 85% of customers renew their subscription. For the first time, this company implements an email programme, messaging their existing customer database. Most customers open these emails and have no other marketing touchpoints with the organisation. Following the launch of this email programme, the data shows that 85% of customers that receive an email renew their subscription.

For both these examples, how much credit should the display campaign for the motors dealership and the email programme for the subscription company receive? The data will say 20% and 85% respectively.

Attribution modelling includes all (known) marketing touchpoints prior to a purchase in its allocation of revenue. Because of that, in both these scenarios, these campaigns would get a lot of credit. But based on what we know, the two campaigns actually had zero impact: these sales would have happened without the marketing efforts. The campaigns were only correlated with the sales; they didn’t cause them.
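
To make the gap between attributed and incremental credit concrete, here is a minimal Python sketch using the percentages from the two examples. The "baseline" figures are an assumption taken from the premise of the examples, namely that customer behaviour would have been the same without the campaign; the function and variable names are hypothetical.

```python
# Percentages from the two examples above. "baseline" is what would have
# happened without the campaign, per the premise of each example (assumption).
examples = {
    "dealership display ads": {"attributed": 20, "baseline": 20},
    "subscription email programme": {"attributed": 85, "baseline": 85},
}


def incremental_points(attributed, baseline):
    """Percentage points the campaign added on top of the baseline."""
    return attributed - baseline


results = {
    name: incremental_points(figures["attributed"], figures["baseline"])
    for name, figures in examples.items()
}

for name, uplift in results.items():
    attributed = examples[name]["attributed"]
    print(f"{name}: attributed {attributed}%, incremental {uplift} points")
```

In practice the baseline is rarely known this neatly; it would usually have to be estimated, for example from a holdout group that is not shown the campaign.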

cartoon with a joke about correlation vs causation

An example I heard once was to imagine three groups of prospects. The first will never buy from you, the second may buy from you and the final group will definitely buy from you. If you run a marketing campaign to all three groups initially, to learn their behaviour, the data will recommend investing all your money in the third group. Based on the data, targeting this group will deliver the highest conversion rate and ROI for your spend. However, as this group was always going to purchase, that marketing money is wasted. To maximise revenue, you need to invest the budget in the second group, the only group where you can influence purchase behaviour.
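
The three-group example can be sketched as follows. The conversion rates are made-up numbers chosen purely to illustrate the point; only the structure (never buy, may buy, definitely buy) comes from the example itself.

```python
# Hypothetical conversion rates (%) for each group, without and with the
# campaign. The numbers are invented purely for illustration.
groups = {
    "will never buy": {"without": 0, "with": 0},
    "may buy": {"without": 10, "with": 30},
    "will definitely buy": {"without": 100, "with": 100},
}

# What the campaign data alone recommends: the highest observed conversion rate.
best_by_conversion = max(groups, key=lambda g: groups[g]["with"])

# What actually matters: the change in conversion caused by the campaign.
uplift = {g: rates["with"] - rates["without"] for g, rates in groups.items()}
best_by_uplift = max(uplift, key=uplift.get)

print("Highest conversion rate:", best_by_conversion)
print("Highest uplift:", best_by_uplift)
```

Ranking by observed conversion rate and ranking by uplift pick different groups, which is exactly why the "without" column matters even though attribution data never contains it.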

Every business has (hopefully) an underlying loyal customer base. These customers will purchase from you even if you switch off all marketing spend. It may be 10% of your sales, it may be 80%; that percentage will vary but it will exist.

The purpose of marketing efforts is to drive incremental revenue, on top of this loyal customer revenue. Accordingly, marketing campaigns should only receive credit for this incremental revenue, not total revenue. Doing otherwise inflates the ROI calculations and leads to unprofitable marketing spend.
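
A rough sketch of how this inflates ROI, using invented figures. The revenue, spend and baseline share below are all hypothetical, and ROI is simplified to (revenue minus spend) divided by spend, ignoring cost of goods.

```python
def roi(revenue, spend):
    """Simplified ROI: net return per unit of spend (ignores cost of goods)."""
    return (revenue - spend) / spend


# Hypothetical figures for illustration only.
total_revenue = 100_000   # all revenue in the period
marketing_spend = 70_000  # all marketing spend in the period
baseline_share = 0.40     # share of revenue that would arrive with no marketing

incremental_revenue = total_revenue * (1 - baseline_share)

attributed_roi = roi(total_revenue, marketing_spend)
incremental_roi = roi(incremental_revenue, marketing_spend)

print(f"ROI if marketing gets credit for all revenue: {attributed_roi:.0%}")
print(f"ROI on incremental revenue only: {incremental_roi:.0%}")
```

With these numbers the attributed ROI is positive while the incremental ROI is negative: the spend looks profitable only because the loyal customer revenue is being credited to it.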

3. Historical Data that is Out of Date (slides 41 – 42)

The maths and machine learning behind data-driven attribution models are incredibly impressive; that is beyond doubt. But all models are dependent on the information that is fed into them and, as the classic saying goes, “garbage in, garbage out”. Putting the whole issue of missing touchpoints to one side, assume that all the information fed into the models is of the highest possible quality. That is great.

But then the world changes…

If these models can’t be adjusted manually for factors that business stakeholders know about, but for which there are no historical data points, the output will be out of date. Out of date information is useless in making decisions.

Examples of factors that could change, and that business stakeholders would know about (and be able to estimate the impact of) but for which there is no historical data to feed into the model, include:

New product range – when you launch a new product range, you have an idea of how it should perform. You know which marketing channels should work best (or at least a best guess at this). But there is no historical data to back you up, unless you are working with a tool that allows you to say “these products are like those products”.

New marketing campaign – a factor in marketing across channels is the halo effect, where marketing in one channel increases performance in a different channel. So, starting marketing in a new channel, or even launching a new campaign, can impact the performance of other campaigns (positively or negatively). There is no data for this historically but all the marketing performance numbers are about to change…

man moving levers on a machine

Change to marketing campaign – increase/decrease budgets, change creatives/messaging, etc. These are tactical/strategic changes made by organisations to impact business performance. The impact should be predictable (rightly or wrongly) by the business stakeholder. But attribution models need to wait for sufficient historical data to flow through before they can account for these changes in their output and recommended actions.

Competitor strategies – the above but actioned by competitors. They launch a new competing premium brand and all your marketing is going to perform differently overnight. Someone else starts bidding on the same search terms as you for the first time and your costs are going up while your ROI is going down. You can react faster than the attribution models can.

New social media platform – who could have predicted the impact of Facebook on marketing efforts and spend 10 or even 5 years ago? Marketing budgets have been changing in more recent times due to TikTok. The next big social media platform may not have been launched yet but getting ahead of the game with it could have a massive impact on your sales (if the audience is relevant).

External factors – ok so it wasn’t an example I used when giving these talks but I couldn’t conceive of a pandemic causing countries to lock down back then. The world changed, we are in a new normal and this new normal is constantly changing. There is no historical data for this.

The Three Key Flaws with Attribution Models (slide 43)

To summarise…

  • Data driven attribution models need all customer touchpoints for the maths to work correctly.
    • It is not possible to capture all customer touchpoints
    • Therefore the maths cannot work correctly
  • Attribution modelling allocates 100% of revenue/conversions to marketing channels
    • A proportion of revenue/conversions happen without being caused by marketing
    • Therefore the revenue/conversions attributed to marketing channels are inflated
  • Attribution modelling is based on historical data
    • When life changes, historical data is not a predictor of future customer behaviour
    • Therefore the attribution modelling cannot allocate future spend by channel or campaign in a reliably optimal way

We need tools that allow us to predict the future, not tools that explain the past.

Part three of this blog post series will follow next week. It details an alternative approach to optimising marketing efforts, to replace the current focus on finding the right attribution model.