We started forecasting several European leagues a while back, both final standings and upcoming matches. The forecasts are based on our world team ranking and should help to evaluate the quality of the ranking more objectively: the better the predictions, the better our ranking assesses actual team strength (at least in theory). This article analyses the season 2017/18. Technical details of our prediction model will soon be available in the method section.
Evaluating Match Forecasts
There are many ways to evaluate the accuracy of our predictions, but
we ultimately settle here on the Ranked Probability Score (RPS). The specialty of the RPS is that
it treats match outcomes (home win/draw/away win) as ordered: a home win is closer
to a draw than to an away win.
If the forecast is perfectly accurate, we obtain a score of 0. This implies that we always predicted a probability of 1 for the correct outcome. The higher the score, the further away we are from this perfect forecast. To get an idea of the "order effect" of the RPS: for a match that ends in a home win, a model that predicts (0.5/0.3/0.2) has an RPS of 0.145, compared to an RPS of 0.170 for a model that predicted (0.5/0.2/0.3).
Intuitively, the first model was closer to the real outcome, since it gave the home team
a probability of 0.8 of not losing, compared to 0.7 for the second model.
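The calculation behind these numbers can be sketched in a few lines. The RPS compares the cumulative forecast probabilities with the cumulative observed outcome (1 for the realised outcome, 0 otherwise) and averages the squared differences:

```python
def rps(probs, outcome):
    """Ranked Probability Score for ordered outcomes.

    probs:   forecast probabilities, ordered (home win, draw, away win)
    outcome: index of the observed outcome (0 = home win, 1 = draw, 2 = away win)
    """
    cum_p, cum_o, score = 0.0, 0.0, 0.0
    for i in range(len(probs) - 1):       # last cumulative term is always 1 - 1 = 0
        cum_p += probs[i]
        cum_o += 1.0 if i == outcome else 0.0
        score += (cum_p - cum_o) ** 2
    return score / (len(probs) - 1)

print(rps([0.5, 0.3, 0.2], 0))  # 0.145
print(rps([0.5, 0.2, 0.3], 0))  # 0.17
```

Running this reproduces the two values from the example above: the second forecast is penalised more because its probability mass sits further from the observed home win.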
Match Forecasting Ability of the World Ranking
Calculating the RPS of our model alone is not enough to see whether we performed well; we
need to compare it to others. We chose two models for comparison: a worst-case and
a best-case model.
The worst-case model assigns uniform probabilities to the outcomes home win, draw
and away win (0.333/0.333/0.333). This serves as an upper reference point for the RPS. Our
model should certainly not perform worse than that; otherwise, even randomly guessing
the outcome would be better than our world ranking.
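As a small aside, the RPS of the uniform forecast is not a single constant: because of the ordering, it depends on which outcome actually occurred (a draw sits in the middle, so the uniform forecast is penalised less when the match is drawn). A quick check, reusing the RPS function from above:

```python
def rps(probs, outcome):
    """Ranked Probability Score for ordered outcomes (home win, draw, away win)."""
    cum_p, cum_o, score = 0.0, 0.0, 0.0
    for i in range(len(probs) - 1):
        cum_p += probs[i]
        cum_o += 1.0 if i == outcome else 0.0
        score += (cum_p - cum_o) ** 2
    return score / (len(probs) - 1)

uniform = [1 / 3, 1 / 3, 1 / 3]
scores = [rps(uniform, o) for o in range(3)]
print(scores)  # home win and away win: ~0.278, draw: ~0.111
```

Averaged over the three possible outcomes this gives roughly 0.22, which is the ballpark a league-wide uniform benchmark will land in.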
More interesting is how our world ranking compares to a best-case model. A best-case model
can be constructed from betting odds, assuming that bookmakers have (or should have) the best
knowledge; otherwise, they would lose money. We obtained the
odds for last season from football-data.co.uk for our
six considered leagues and calculated the RPS scores. The figure below
shows how our world ranking compares to these two models.
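Turning betting odds into outcome probabilities requires one extra step: the quoted odds embed the bookmaker's margin, so their implied probabilities sum to more than 1. A common fix, sketched below with made-up odds, is to take the inverse odds and normalise them (this is an illustration, not necessarily the exact procedure used for the figure):

```python
def odds_to_probs(odds):
    """Convert decimal odds (home, draw, away) to probabilities.

    The inverse odds sum to more than 1 because of the bookmaker's
    margin ("overround"); simple normalisation removes it.
    """
    inv = [1.0 / o for o in odds]
    total = sum(inv)              # > 1 due to the margin
    return [x / total for x in inv]

# Hypothetical odds for a single match:
probs = odds_to_probs([2.10, 3.40, 3.60])
print(probs)  # sums to 1, home win most likely
```

More refined margin-removal schemes exist (e.g. power or Shin adjustments), but plain normalisation is the usual baseline.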
The exact values can be found in the following table.
The performance is surprisingly good! We were better than random guessing in all leagues,
but of course that should not be the real benchmark. Compared to betting odds, the world
ranking performed rather poorly for the Swiss Super League and the German Bundesliga.
In the remaining leagues, however, we are close to the (potentially) optimal RPS.
For Serie A and Ligue 1, we even matched that optimum.
Of course, it would also be of interest how we compare to other pundits. One such comparison
is possible with a series of public prediction models for the Premier League, starting at GW 9. For
this time window, we achieve an RPS of 0.186, which would put our world ranking in
16th position out of 31 models. So, fairly average.
Forecasting Final Tables
Together with our match forecasts, we also published predictions for the final standings
of the leagues. These were based on the average outcome of 1,000 simulations of the remaining matches.
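The simulation idea can be sketched as follows. This is a minimal illustration, not our actual model: the team names, current points and per-match probabilities are made up, and each remaining fixture is simply drawn from fixed (home win/draw/away win) probabilities, with final point totals averaged over all runs:

```python
import random

def simulate_season(points, fixtures, probs, n_sims=1000, seed=42):
    """Average final point totals over n_sims simulated season endings.

    points:   current points per team, e.g. {"A": 10, "B": 8}
    fixtures: remaining matches as (home, away) pairs
    probs:    (p_home, p_draw, p_away) per fixture
    """
    rng = random.Random(seed)
    totals = {t: 0.0 for t in points}
    for _ in range(n_sims):
        season = dict(points)
        for home, away in fixtures:
            p_home, p_draw, _ = probs[(home, away)]
            u = rng.random()
            if u < p_home:
                season[home] += 3
            elif u < p_home + p_draw:
                season[home] += 1
                season[away] += 1
            else:
                season[away] += 3
        for t in season:
            totals[t] += season[t]
    return {t: totals[t] / n_sims for t in totals}

# Hypothetical two-team example with one match left:
avg = simulate_season({"A": 10, "B": 8}, [("A", "B")],
                      {("A", "B"): (0.5, 0.3, 0.2)}, n_sims=2000)
```

With these numbers, team A's average lands near 10 + 3·0.5 + 1·0.3 = 11.8 points, which is exactly the expected-value view the simulation approximates.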
The chart below shows the average deviation in points of our predictions
from the final standings.
Naturally, the closer we get to the end of the season, the smaller the error becomes.
Even so, the final error for the Swiss Super League is still almost three points per team.
The lowest overall error was obtained for the Premier League where, at the end, we were off by
We can also ignore the points and simply check how close we were to having the correct
final ranking. After all, it does not really matter whether a team has 80 or 83 points if
it finishes first either way. The chart below shows the correlation of our predicted
rankings with the final ranking.
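A natural choice for correlating two rankings is Spearman's rank correlation (whether the charts use exactly this coefficient is an assumption on our part). For two tie-free rankings it reduces to a simple formula in the squared rank differences:

```python
def spearman(pred_rank, final_rank):
    """Spearman rank correlation for two tie-free rankings of 1..n.

    Uses the closed form 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the per-team difference in rank.
    """
    n = len(pred_rank)
    d2 = sum((p - f) ** 2 for p, f in zip(pred_rank, final_rank))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical four-team league: predicted vs. final positions.
print(spearman([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0 (perfect prediction)
print(spearman([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0 (exactly reversed)
```

A value near 0.9, as reported below for Serie A, means the predicted order was almost identical to the final table, with only minor swaps.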
Although we had the biggest error for the Swiss Super League in terms of points,
we achieved the best prediction of the final ranking there in our last forecast. In the beginning,
though, we were quite far off again. The most stable predictions were obtained for Serie A, where
we always stayed close to a correlation of 0.9.