With the group stages of football’s men’s World Cup in Qatar now complete, the pathways to the final for the remaining 16 teams are clear. ‘Scorecasting’ economists make Brazil favourites to win the tournament with a probability of 28%, while England’s chance of bringing home the trophy is 8.7%.
The first phase of football’s men’s World Cup has been completed after plenty of drama and excitement. Two of the more favoured countries – Belgium and Germany – have already gone out of the tournament, with another – Spain – surviving a qualification scare.
Now that the initial 32 teams have been reduced to 16, it is a good time to revise forecasts made at the outset of the tournament. The elimination of some of the bigger teams, allied with the predetermined structure of the draw, means that much more is known now about how the tournament will play out.
The sense of anticipation of events yet to occur drives a large part of the interest in an event like the World Cup. Research using Google searches for events in which the winner is often known before the schedule is complete gives credence to this: once the winner is known, search volumes drop.
How can we best make predictions? Predictions can be as basic as gut instinct – or judgemental forecasts, as they are often referred to in statistical research – or they can be based on established statistical methods.
Bookmakers use statistical models to set the odds, for example, whereas ex-footballer Chris Sutton on BBC Sport makes judgemental forecasts. Here, we introduce a simulation-based statistical method to make forecasts. We simulated the match outcomes of each of the 64 matches of the World Cup, and looked at what might happen.
Our method for simulation involves drawing a random number from a statistical distribution – a Poisson distribution, well known for approximating variables that are counts, such as the number of doctor visits, the number of workplace absences or the number of goals. The distribution has a mean (average) – the expected number of goals a team will score – and a variance – how varied that number is.
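The drawing step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the sampling routine (Knuth's multiplication method), the assumption that the two teams' goals are independent, and the means of 1.5 and 1.0 goals are all choices made here for the example.

```python
import math
import random


def draw_goals(expected_goals, rng):
    """Draw one Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-expected_goals)
    goals = 0
    product = rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def simulate_scoreline(mean_a, mean_b, rng):
    """Simulate one match, assuming the two teams' goals are independent."""
    return draw_goals(mean_a, rng), draw_goals(mean_b, rng)


rng = random.Random(2022)  # fixed seed so the run is repeatable
# Hypothetical expected goals: 1.5 for one team, 1.0 for the other.
scorelines = [simulate_scoreline(1.5, 1.0, rng) for _ in range(20000)]
average_a = sum(a for a, _ in scorelines) / len(scorelines)
average_b = sum(b for _, b in scorelines) / len(scorelines)
print(scorelines[:5], round(average_a, 2), round(average_b, 2))
```

Over many draws, the average number of goals for each team should settle close to the mean that was fed in, while individual scorelines vary from match to match.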
The idea is that scorelines generated (for example, 3-1, 2-4 or 1-1) are such that, on average, they look like actual, plausible football scores. The most common scoreline in the history of football is a 1-1 draw (about 11% of the hundreds of thousands of matches stored on www.soccerbase.com finish 1-1), with a 1-0 home win the second most likely. But on average, the home team scored 1.7 goals and the away team 1.2, and hence, after rounding, a 2-1 scoreline is also relatively common (about 9% of all matches).
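As a rough check on these figures, the probability of any scoreline can be computed directly if we assume, for illustration only, that home and away goals are independent Poisson variables with the averages quoted above (1.7 and 1.2). The function names are hypothetical, and the independence assumption is a simplification made here rather than something the article states.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals when the expected number is `mean`."""
    return mean ** goals * math.exp(-mean) / math.factorial(goals)


def scoreline_probability(home_goals, away_goals, home_mean=1.7, away_mean=1.2):
    """Probability of a scoreline, assuming independent home and away goals."""
    return poisson_pmf(home_goals, home_mean) * poisson_pmf(away_goals, away_mean)


p_one_one = scoreline_probability(1, 1)
p_two_one = scoreline_probability(2, 1)
p_one_nil = scoreline_probability(1, 0)
for label, p in [("1-1", p_one_one), ("2-1", p_two_one), ("1-0", p_one_nil)]:
    print(f"{label}: {p:.3f}")
```

Under these assumptions, a 1-1 draw comes out at about 11% and a 2-1 home win at roughly 9.5%, close to the observed frequencies. The simple model does not reproduce every feature of the data, though: it puts 1-0 marginally below 2-1, whereas in the historical record 1-0 is the more common result.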
We model the expected number of goals that a team will score as being dependent on how strong they are and how strong their opposition is. We measure how strong each team is using what are called Elo ratings. These have been used to model match outcomes in football, as well as in other sports.
It is convention to begin Elo ratings at 1000 when a team plays their first ever match, and then update after each match by a function of the team strengths of the two teams involved. The adjustment factor can be made bigger (more noisy ratings) or smaller (less quickly reflecting changing strengths of teams).
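A minimal sketch of such an update is shown below. It assumes the standard Elo formula, with a logistic expectation on a 400-point scale; the article does not spell out the exact function used, so the formula, the function names and the default adjustment factor `k` should be read as illustrative.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expectation (assumed here): 1 = certain win, 0 = certain loss."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_ratings(rating_a, rating_b, result_a, k=20.0):
    """Return both teams' new ratings after one match.

    result_a is 1.0 if team A won, 0.5 for a draw and 0.0 if team A lost.
    k is the adjustment factor: larger values move ratings further per match.
    """
    change = k * (result_a - expected_score(rating_a, rating_b))
    return rating_a + change, rating_b - change


# Two teams playing their first ever match both start at 1000.
new_a, new_b = update_ratings(1000.0, 1000.0, result_a=1.0)
print(new_a, new_b)  # 1010.0 990.0
```

Because the points gained by one team are taken from the other, the total rating in the system is unchanged by a match. A win against a much stronger side moves the ratings more than a win that was already expected.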
To create these ratings, a conservative adjustment factor of 20 has been used. On these ratings, Brazil are the best team, with Spain second, followed by Argentina and France.
Outside this top four is a cluster of European nations: England, Germany, Netherlands, Belgium and Portugal. With a few exceptions, these rankings are the same as those constructed by the website www.eloratings.net, which increases the adjustment factor for more important matches, up to 60 for matches in World Cup finals.
When Brazil, with an Elo rating of 1400, face Cameroon, who have an Elo rating of 1130, Brazil would be expected to win. Indeed, the Elo prediction would be around 0.83, implying that Brazil have an 83% chance of winning (disregarding the possibility of a draw) – so the group stage defeat by Cameroon in this World Cup was highly unexpected. If Brazil face Spain, the prediction is 0.57 or 57%.
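The 0.83 figure can be reproduced with the standard Elo expectation formula on a 400-point scale. That the authors use exactly this formula is an assumption, although it matches the numbers quoted; the function names below are hypothetical. Inverting the formula also shows what rating gap a 57% prediction would imply.

```python
import math


def elo_win_probability(rating, opponent_rating, scale=400.0):
    """Standard Elo expectation for the first team (draws disregarded)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def rating_gap_for(probability, scale=400.0):
    """Rating advantage that would produce the given Elo prediction."""
    return scale * math.log10(probability / (1.0 - probability))


brazil_v_cameroon = elo_win_probability(1400, 1130)
implied_gap_to_spain = rating_gap_for(0.57)
print(round(brazil_v_cameroon, 2))  # 0.83, matching the figure in the text
print(round(implied_gap_to_spain))  # about 49 rating points
```

On this reading, a 57% prediction corresponds to Brazil being rated roughly 50 points above Spain, a far smaller gap than the 270 points separating Brazil and Cameroon.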
Our simulation method draws goals at random for each team according to the strengths of the two teams playing, and thus generates entire scenarios for the World Cup. Group outcomes, and each team’s route through the later stages, can be inferred from the simulated results. We employ a correction for the home advantage that is well known in sport and international football.
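To make the idea concrete, here is a toy version of such a simulation for a four-team knockout bracket. Everything specific in it is an assumption made for illustration: the team names and ratings, the rule turning a rating gap into expected goals (`BASE_GOALS` and `SENSITIVITY`), the coin toss standing in for extra time and penalties, and the omission of any home advantage correction. It is a sketch of the general technique, not the model behind the tables in this article.

```python
import math
import random
from collections import Counter

BASE_GOALS = 1.3   # hypothetical expected goals between evenly matched teams
SENSITIVITY = 1.0  # hypothetical strength of the link from rating gap to goals


def expected_goals(rating, opponent_rating):
    """Illustrative mapping from a rating gap to expected goals."""
    return BASE_GOALS * math.exp(SENSITIVITY * (rating - opponent_rating) / 400.0)


def draw_goals(mean, rng):
    """Draw a Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def play_knockout(team_a, team_b, ratings, rng):
    """Simulate one knockout match; level scores are settled by a coin toss."""
    goals_a = draw_goals(expected_goals(ratings[team_a], ratings[team_b]), rng)
    goals_b = draw_goals(expected_goals(ratings[team_b], ratings[team_a]), rng)
    if goals_a == goals_b:
        return rng.choice([team_a, team_b])
    return team_a if goals_a > goals_b else team_b


def simulate_bracket(teams, ratings, rng):
    """Play rounds between adjacent teams until one winner remains."""
    remaining = list(teams)
    while len(remaining) > 1:
        remaining = [
            play_knockout(remaining[i], remaining[i + 1], ratings, rng)
            for i in range(0, len(remaining), 2)
        ]
    return remaining[0]


def title_probabilities(teams, ratings, n_sims=10000, seed=0):
    """Share of simulated tournaments won by each team."""
    rng = random.Random(seed)
    wins = Counter(simulate_bracket(teams, ratings, rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in teams}


# Hypothetical teams and ratings; A plays B and C plays D in the semi-finals.
ratings = {"A": 1400, "B": 1130, "C": 1350, "D": 1250}
probabilities = title_probabilities(["A", "B", "C", "D"], ratings)
print(probabilities)
```

Repeating the whole bracket thousands of times and counting how often each team wins is what turns match-level randomness into tournament-level probabilities. The same counting can be done for reaching each intermediate round.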
Table 1 lists the chances of each country making each knock-out stage of the tournament, and ultimately winning the competition, as calculated at the outset of the tournament. Brazil were almost certain to reach the last 16 (95%), as were Argentina (87%). Belgium (79%) and Germany (71%) are the two most surprising exits from the tournament thus far.
Table 1: Pre-tournament probabilities of teams reaching each stage of World Cup 2022
Country | Last 16 | Quarter finalists | Semi finalists | Finalists | Winners |
Brazil | 95 | 76 | 52 | 38 | 27 |
Spain | 81 | 59 | 34 | 22 | 13 |
Argentina | 87 | 59 | 39 | 20 | 11 |
France | 84 | 55 | 36 | 20 | 10 |
England | 77 | 51 | 28 | 14 | 6 |
Germany | 71 | 45 | 24 | 13 | 6 |
Netherlands | 86 | 51 | 27 | 12 | 6 |
Belgium | 79 | 39 | 21 | 11 | 5 |
Portugal | 76 | 41 | 20 | 10 | 4 |
Iran | 59 | 33 | 15 | 6 | 2 |
Mexico | 58 | 26 | 11 | 4 | 1 |
Croatia | 58 | 20 | 8 | 3 | 1 |
Uruguay | 52 | 21 | 8 | 3 | 1 |
South Korea | 54 | 21 | 8 | 3 | 1 |
USA | 49 | 25 | 11 | 3 | 1 |
Japan | 38 | 18 | 7 | 3 | 1 |
Denmark | 53 | 22 | 9 | 3 | 1 |
Switzerland | 50 | 22 | 8 | 3 | 1 |
Morocco | 46 | 14 | 4 | 1 | 0 |
Australia | 40 | 15 | 6 | 2 | 0 |
Senegal | 52 | 19 | 6 | 2 | 0 |
Poland | 35 | 11 | 4 | 1 | 0 |
Serbia | 34 | 12 | 3 | 1 | 0 |
Tunisia | 24 | 7 | 2 | 0 | 0 |
Ecuador | 36 | 10 | 3 | 1 | 0 |
Saudi Arabia | 21 | 5 | 1 | 0 | 0 |
Cameroon | 21 | 6 | 1 | 0 | 0 |
Qatar | 26 | 6 | 2 | 0 | 0 |
Costa Rica | 10 | 3 | 1 | 0 | 0 |
Wales | 15 | 4 | 1 | 0 | 0 |
Ghana | 18 | 3 | 1 | 0 | 0 |
Canada | 17 | 3 | 1 | 0 | 0 |
Now that the group stages have been completed, we can update these numbers, and we present the new calculations in Table 2. Here, we have simulated the tournament from here on in. A lot of uncertainty is removed now that there are fewer potential winners and the pathways for each country to the final are much clearer.
Table 2: Probability of reaching each stage of World Cup 2022, by team in the last 16
Country | Quarter-finalists | Semi-finalists | Finalists | Winners |
Brazil | 79.9 | 63.7 | 42.7 | 28.0 |
Spain | 75.2 | 49.9 | 29.4 | 15.3 |
France | 77.6 | 46.7 | 25.3 | 12.5 |
Argentina | 75.1 | 46.1 | 23.0 | 12.5 |
England | 71.7 | 37.1 | 18.9 | 8.7 |
Netherlands | 63.4 | 30.8 | 13.7 | 6.7 |
Portugal | 65.5 | 29.2 | 14.4 | 6.0 |
Croatia | 51.0 | 13.5 | 5.2 | 1.8 |
Japan | 49.0 | 12.8 | 5.0 | 1.6 |
USA | 36.6 | 14.1 | 4.2 | 1.6 |
South Korea | 20.1 | 10.0 | 3.8 | 1.3 |
Morocco | 24.8 | 10.1 | 3.5 | 1.1 |
Switzerland | 34.5 | 10.8 | 3.6 | 1.0 |
Australia | 24.9 | 9.0 | 2.5 | 0.8 |
Senegal | 28.3 | 9.3 | 2.8 | 0.7 |
Poland | 22.4 | 7.0 | 2.2 | 0.5 |
Despite teams with a combined 16% chance of winning the World Cup being eliminated, and despite knowledge of how Brazil will reach the final, their probability of winning is only slightly changed, at 28% (up from 27%). Spain also jump – from 13% to 15%; Argentina from 11% to 12.5%; and France from 10% to 12.5%.
It is perhaps not totally surprising that the probabilities haven’t changed much; after all, none of the four most likely countries emerged from their groups unbeaten. No team won all three group stage matches, and as such no team looks unbeatable… yet.
The probabilities here, with South Korea having a 20% chance of making the quarter-finals despite facing the daunting task of playing Brazil, tell us that even the least likely upset in the Round of 16 should happen about one time in five. Adding up the underdogs’ chances across all eight matches in Table 2 gives around 2.4, so we should expect two or three unexpected results at this stage. If any of South Korea, Australia, Poland or Morocco win their matches at this stage, it makes the passage of their quarter-final opponents, on paper, to the semi-finals a little bit easier.
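The expected number of upsets can be worked out from the quarter-final column of Table 2 by adding up the underdog's probability in each tie. The sketch below does this; the pairings are the actual Round of 16 fixtures, which the article does not list, and treating the eight matches as independent for the "at least one upset" figure is an assumption made here.

```python
import math

# (favourite, % chance, underdog, % chance), probabilities from Table 2.
ties = [
    ("Netherlands", 63.4, "USA", 36.6),
    ("Argentina", 75.1, "Australia", 24.9),
    ("France", 77.6, "Poland", 22.4),
    ("England", 71.7, "Senegal", 28.3),
    ("Croatia", 51.0, "Japan", 49.0),
    ("Brazil", 79.9, "South Korea", 20.1),
    ("Spain", 75.2, "Morocco", 24.8),
    ("Portugal", 65.5, "Switzerland", 34.5),
]

# Each underdog's chance of going through, as a probability between 0 and 1.
upset_chances = [underdog_pct / 100 for _, _, _, underdog_pct in ties]

expected_upsets = sum(upset_chances)
# Chance of at least one upset, assuming the matches are independent.
p_at_least_one = 1 - math.prod(1 - p for p in upset_chances)

print(round(expected_upsets, 2))  # 2.41
print(round(p_at_least_one, 3))
```

An expected count of about 2.4 means that a Round of 16 in which every favourite goes through would itself be a surprising outcome.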
Where can I find out more?
- Evaluating strange forecasts: The curious case of football match scorelines – study by James Reade and colleagues.
- Going with your gut: The (in)accuracy of forecast revisions in a football score prediction game – another research paper by James Reade and colleagues.
- Futbolmetrix’s blog: Discussion of football (futbol, calcio, soccer) and numbers.
Who are experts on this question?
- Alex Krumer, Molde University College
- James Reade, University of Reading
- Carl Singleton, University of Reading
- Simon Gleave, Gracenote
- Daniele Paserman, Boston University
Author: James Reade