Split test the smart way: mitigate cognitive bias through confidence testing

A cognitive bias is a common tendency to process information by filtering it through one’s own likes, dislikes, and experiences. In this article I will explain how cognitive biases affect our ability to judge the results of a split test, and show you an easy way to avoid making incorrect decisions based on misleading data.

Why confidence is everything for digital advertisers

The beauty of digital advertising lies in its quantifiable nature. The ability to attribute real results to an advertising investment allows digital advertisers to make informed decisions when allocating budget and defining strategy. Measuring the results of digital marketing campaigns helps us drive traffic into the areas which provide the best return on investment, highest conversion rate, or lowest cost per acquisition, depending on the objective.

Having all this data is a blessing, but you must be careful not to make strategic decisions based on statistically insignificant data. You may appear to have identified a clear winner in your ad creative or landing page split test, but how can you be sure that your analysis of the results isn’t skewed by your cognitive biases? How can you be sure that the results aren’t due to chance alone?

You are inherently biased, which makes for poor decision making

When it comes to analysing the results of your split test experiments, a whole host of cognitive biases come into play. Most notably for split testing, confirmation bias will make you more likely to accept results which support your hypothesis. Take the split test results below – which ad variant do you think is the winner?

Ad Variant | Impressions | Clicks | Click Through Rate
Variant A | 1,520 | 75 | 4.93%
Variant B | 1,357 | 76 | 5.60%

Variant B has a click through rate of 5.60%, quite a bit higher than variant A’s 4.93%. The sample size also looks reasonably large, with over 1,000 impressions for each ad. At first glance variant B seems to be the clear winner, but in reality no clear winner can be identified from these results. Ending the test now and declaring variant B the winner would be a poor decision.
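As a side note, the click through rates above are simply clicks divided by impressions, so you can reproduce them directly. A trivial Python sketch using the example figures:

```python
# Click through rate = clicks / impressions (figures from the example table)
ctr_a = 75 / 1520
ctr_b = 76 / 1357

print(f"Variant A CTR: {ctr_a:.2%}")   # 4.93%
print(f"Variant B CTR: {ctr_b:.2%}")   # 5.60%
```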

Asking the right questions

We’ve established that human nature affects our ability to judge the significance of test results. Now we need to know how to safeguard against this, and the answer lies in asking the right questions. Let’s start from the beginning of the split testing process.

Question 1
“What am I going to split test?”

Ad creative split testing is a method of statistical hypothesis testing. When creating an ad split test, you should always start with a hypothesis. This might look something like “I predict that adding a time-sensitive call to action will increase click through rate”. This is known as your alternate hypothesis.

Acting on our alternate hypothesis, we might decide to test two ad variants:

Ad variant A: Contains call to action “Buy Online!”

Ad variant B: Contains call to action “Buy Online Now!”

Question 2
“Which ad variant gets clicked the most?”

To answer this we rotate the ad variants, serving them evenly to our target audience, and measure their click through rates. Let’s assume that one ad does come out ahead, with a higher click through rate than the other.

Question 3
“Could the difference in click through rate be due to chance alone?”

To answer this question, we need to test our alternate hypothesis against the null hypothesis, which states that “click through rate is the same for both ad variants, and any difference in reported click through rate is due to chance alone.” We do this by calculating the confidence rating of our results.

Getting technical – Confidence rating & P-value

Calculating the confidence rating of our split test results requires some rather advanced mathematics. Fortunately, tools exist to perform these calculations automatically. Essentially, these tools work by determining the result’s P-value. The P-value represents the probability of observing a result at least as extreme as the observed result if the null hypothesis is true. In ad creative split testing, the ‘observed result’ is the difference between variant A’s and variant B’s click through rates.

In other words, if the P-value is low, the difference in click through rate between ad variant A and ad variant B is not likely to be due to chance alone. Instead, the difference is likely due to our alternate hypothesis: that the time-sensitive call to action in variant B resulted in a higher click through rate.
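To make this concrete, here is a minimal sketch of the kind of calculation a significance tool might perform, assuming a two-tailed, pooled two-proportion z-test (a common approach; the function name and structure are illustrative rather than any particular tool’s API):

```python
import math

def split_test_p_value(clicks_a, impressions_a, clicks_b, impressions_b):
    """Two-tailed P-value for the difference in click through rate
    between two ad variants (pooled two-proportion z-test)."""
    ctr_a = clicks_a / impressions_a
    ctr_b = clicks_b / impressions_b

    # Pooled CTR: the single click through rate assumed by the null hypothesis
    pooled = (clicks_a + clicks_b) / (impressions_a + impressions_b)

    # Standard error of the difference in CTRs if the null hypothesis is true
    se = math.sqrt(pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b))

    # How many standard errors apart the two observed CTRs are
    z = (ctr_b - ctr_a) / se

    # Probability of a difference at least this extreme under the null hypothesis
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```

Feeding in the example figures (75 clicks from 1,520 impressions versus 76 from 1,357) returns a P-value of roughly 0.42, which we will come back to below.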

Let’s take another look at our earlier example of a split test result:

Ad Variant | Impressions | Clicks | Click Through Rate
Variant A | 1,520 | 75 | 4.93%
Variant B | 1,357 | 76 | 5.60%

Using a statistical significance calculator we can determine the confidence rating for the results in our example above. In this case it is: 57.63%

We can therefore only be 57.63% confident that variant B’s higher click through rate is not due to chance. To confidently declare variant B the winner, we need a confidence rating of at least 95%.
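As a sanity check, 57.63% is consistent with 1 minus the two-tailed P-value from the z-test sketched above, and you can reproduce it without an online calculator using an off-the-shelf library such as statsmodels (shown purely as an illustration):

```python
from statsmodels.stats.proportion import proportions_ztest

clicks = [76, 75]            # Variant B, Variant A
impressions = [1357, 1520]

# Two-tailed test: could the difference in click through rate be due to chance alone?
z_stat, p_value = proportions_ztest(clicks, impressions)

print(f"P-value: {p_value:.4f}")          # ≈ 0.4237
print(f"Confidence: {1 - p_value:.2%}")   # ≈ 57.63%
```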

A confidence rating below 95% does not prove the null hypothesis; it just means that we don’t yet have a large enough sample size to disprove it. In this case, we need to continue testing until we reach a confidence rating of 95% or higher.

Note: if your ads are too similar, you may never reach a confidence rating above 95%. If your click through rates remain very similar even with a large sample size (many impressions), you may need to start a new test with less similar ad variants.
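To get a feel for how much longer such a test might need to run, a standard two-proportion sample size formula (a general rule of thumb, not something taken from the test above) estimates the impressions needed per variant to detect a given difference in click through rate at 95% confidence with 80% power:

```python
import math
from statistics import NormalDist

def impressions_needed(ctr_a, ctr_b, confidence=0.95, power=0.80):
    """Approximate impressions per variant needed to detect the difference
    between two click through rates (standard two-proportion formula)."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-tailed
    z_beta = NormalDist().inv_cdf(power)
    pooled = (ctr_a + ctr_b) / 2
    numerator = (z_alpha * math.sqrt(2 * pooled * (1 - pooled))
                 + z_beta * math.sqrt(ctr_a * (1 - ctr_a) + ctr_b * (1 - ctr_b))) ** 2
    return math.ceil(numerator / (ctr_b - ctr_a) ** 2)

# Click through rates as close as those in the example need a much larger sample
print(impressions_needed(0.0493, 0.0560))   # roughly 17,000 impressions per variant
```

In other words, with click through rates this close you would need more than ten times the traffic in the example before a 95% confidence rating becomes realistic.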

Henry Carless

With over 12 years’ experience in search engine marketing, Henry has extensive knowledge of paid search strategy and delivery. He has managed advertising campaigns for companies of all sizes over a wide range of industries, from start-ups to multinational partnerships with seven figure PPC budgets. Henry joined Vertical Leap in 2012 as a PPC specialist, managing and delivering campaigns for clients and has since become one of our Data Scientists. He has what could be considered an obsession with data analysis; measuring and tracking everything he can in order to fully understand how our adverts perform. In his spare time, Henry enjoys 3D printing, Dungeons & Dragons, home brewing, and golf.
