Football managers like to say it’s a game of two halves; eleven men versus eleven men where anything can happen. But if that’s the case, why are teams like Brazil and Germany consistent winners of tournaments? What makes teams like Spain, Italy and France continual favourites?
We believe the answer lies in data. With enough data points and enough analysis of those data points, it is perhaps possible to predict the winner of the World Cup based on data alone.
We decided to give it a go, but only on a small scale. We’ll explain our process below, and you can play with the data yourself to draw your own conclusions.
Up until 2000, the only team to have won a World Cup outside its own continent was Brazil. They won in Sweden in 1958 and again in the USA in 1994, although Brazil is effectively in the same hemisphere as the USA and practically the same time zone, and they only won on penalties after a bore draw final, thanks to Roberto Baggio uncharacteristically missing a penalty. In the 21st century, three of the last four World Cups have been won by teams outside of their own continent: Germany won in Brazil, Spain won in South Africa and Brazil won in Japan.
But, of all the times the competition has been hosted in Europe, only once has a non-European country won.
European teams are on a roll: they are the ones that have recently been winning outside of their own continent (Spain and Germany, as mentioned above).
2014 World Cup winners Germany have not enjoyed a great build up to this year’s World Cup in Russia, winning only once in the last six games – and that was against Saudi Arabia. Previous campaigns in the lead up to a World Cup have gone much smoother for the Germans. Will a lacklustre run of games have a negative effect on the reigning champions or is it going to be business as usual for Joachim Löw?
Using data on all the caps across each 23-man squad, and all their international goals, we can calculate the average goals per cap across each team. Using this metric, Argentina’s current squad has the best goal-scoring record over time, whereas Egypt has the worst.
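As a rough illustration of the goals-per-cap metric, here is a minimal Python sketch. It assumes the team figure is total international goals divided by total caps across the squad (rather than an average of each player's individual ratio), and the `Player` class and the example squad are placeholders of our own, not real squad data.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    caps: int   # international appearances
    goals: int  # international goals


def goals_per_cap(squad):
    """Total squad goals divided by total squad caps (assumed definition)."""
    total_caps = sum(p.caps for p in squad)
    total_goals = sum(p.goals for p in squad)
    # Guard against a squad with no caps at all.
    return total_goals / total_caps if total_caps else 0.0


# Placeholder squad for illustration only; not real player data.
example_squad = [
    Player("Forward", caps=100, goals=50),
    Player("Midfielder", caps=60, goals=10),
    Player("Goalkeeper", caps=40, goals=0),
]

squad_average = goals_per_cap(example_squad)  # 60 goals over 200 caps
```

Pooling goals and caps this way means heavily capped players carry more weight than newcomers, which suits a measure of a squad's record over time.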
Manchester City have supplied the most players for this year’s tournament, with 16 of their squad travelling to Russia. This includes four from Brazil who play at Manchester City together.
Russia’s goal-scoring average at World Cups has been declining, and the home team is currently 70th in the FIFA/Coca-Cola World Rankings. Will home advantage give Russia a competitive edge this time?
Three of Iceland’s team play club football in Russia, as do three from Iran, two from Sweden and one from Poland. We gave proximity points to each player, based on their home club’s proximity to Russia. This helps us calculate a home advantage score for each team.
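One way a proximity score like this could be put together is sketched below in Python. Everything specific in it is an assumption for illustration: it measures distance from each club's city to Moscow, the distance bands and point values are hypothetical, and the team score is simply the sum of its players' points.

```python
import math

# Assumption: Moscow stands in as the reference point for "Russia".
MOSCOW = (55.7558, 37.6173)  # (latitude, longitude) in degrees


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def proximity_points(distance_km):
    """Hypothetical bands: the closer the club, the more points."""
    if distance_km <= 500:
        return 3
    if distance_km <= 2500:
        return 2
    if distance_km <= 5000:
        return 1
    return 0


def home_advantage_score(club_locations):
    """Sum of proximity points over every player's club location."""
    return sum(proximity_points(haversine_km(loc, MOSCOW))
               for loc in club_locations)


# Illustrative two-player "squad": one club in Moscow, one in Buenos Aires.
example_clubs = [MOSCOW, (-34.6037, -58.3816)]
score = home_advantage_score(example_clubs)
```

A banded score is easier to explain in a chart than raw kilometres, but the cut-offs are arbitrary, so they are worth adjusting to see how much they change the ranking.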
To build a model capable of accurately predicting the winner, we would need big data: the result of every match played by every team, not only at World Cup Finals, but also in qualification games, friendlies and matches at other tournaments.
We would need to take into account things like location of those games, how far the teams travelled, how far into the season the games took place, perhaps the ranking of the opponents.
We would also need big data relating to all the players in all the games. Which players were in the games? How was their form at the time? How had those players performed historically before every match?
The location of a World Cup Finals tournament could also affect team performance – how far they travel to each match, how far they are away from home, how many days there are between matches.
Making use of every available data point would require algorithmic analysis and machine learning to develop models for predicting outcomes. Ideally, we would test our model’s predictions against real results, so we could train it to get better over time.
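To make the idea of testing predictions concrete, here is a toy Python sketch. It is not a machine learning model: it assumes a very simple predictor that backs whichever team has the better historical win rate, "trains" on earlier matches and checks its accuracy on later ones. The match list is placeholder data with made-up team names, and draws are ignored for simplicity.

```python
from collections import defaultdict

# Placeholder results in chronological order: (team_1, team_2, winner).
# These are invented for illustration and are not real matches.
matches = [
    ("Team A", "Team B", "Team A"),
    ("Team A", "Team C", "Team A"),
    ("Team B", "Team C", "Team B"),
    ("Team A", "Team B", "Team A"),
    ("Team B", "Team C", "Team C"),
    ("Team A", "Team C", "Team A"),
]


def win_rates(history):
    """Share of matches won by each team in the given history."""
    played = defaultdict(int)
    won = defaultdict(int)
    for team_1, team_2, winner in history:
        played[team_1] += 1
        played[team_2] += 1
        won[winner] += 1
    return {team: won[team] / played[team] for team in played}


def predict(rates, team_1, team_2):
    """Back the team with the higher win rate; ties go to team_1."""
    return team_1 if rates.get(team_1, 0.0) >= rates.get(team_2, 0.0) else team_2


def accuracy(train, test):
    """Fraction of held-out matches whose winner was predicted correctly."""
    rates = win_rates(train)
    correct = sum(predict(rates, a, b) == winner for a, b, winner in test)
    return correct / len(test)


# Learn from the first four matches, then test on the last two.
train, test = matches[:4], matches[4:]
holdout_accuracy = accuracy(train, test)
```

Splitting by date rather than at random matters here, because a model should only ever be judged on matches played after the ones it learned from.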
We don’t have all that data, and we don’t have time to train a model to the extent we would like, but nevertheless we decided to do some analysis on a smaller scale and have some fun with data visualisation and insight.
We chose to focus only on historic matches played in World Cup Finals by the teams competing in this year’s World Cup. This means we have some dark horses. Iceland and Panama have never qualified before, so they have no history of matches played. Logic would suggest that a first timer is unlikely to win anyway, other than Uruguay, who won the first ever tournament in 1930 on home soil.
Use our interactive report to see if you can predict a winner based on the available data.
George joined Vertical Leap in 2017 after working in-house in digital marketing teams, working across various digital platforms. He studied Music & Media Technologies and pursued a love of film by creating a few short films whilst travelling through Asia and Europe, working freelance on the side.
George developed his knowledge of digital products and services in different industries before joining Vertical Leap to specialise in SEO. Working in-house allowed him to develop ways to discuss complex situations in a way that even the biggest technophobes can understand. George has also recently taken on a new role in our Performance UX team, splitting his time between the two specialisms.
Having lived in Portsmouth his whole life, George enjoys everything the city has to offer. In his free time he enjoys nothing more than sitting down with the family dog, Bruno, to watch the football and support Manchester United.