Anthony's model Dolores was developed for the International Machine Learning for Soccer Competition hosted by the Machine Learning journal.
All participants were provided with the results of matches from 52 different leagues around the world, with some missing data as part of the challenge. They had to produce a single model before the end of March 2017, which would then be tested on its accuracy in predicting the outcomes of 206 future matches from 26 different leagues, played between March 31 and April 9, 2017.
Dolores was ranked 2nd, with a predictive accuracy almost the same as that of the top-ranked system (the difference in error rate between the two was less than 1%; the error rate was nearly 120% lower than that of the lowest-ranked participants among those that passed the basic criteria).
Dolores is designed to predict football match outcomes in one country by observing football matches in multiple other countries. It is based on a) dynamic ratings and b) Hybrid Bayesian Networks.
Unlike past academic literature, which tends to focus on a single league or tournament, Dolores provides empirical proof that a model can make a good prediction for a match outcome between teams 𝑥 and 𝑦 even when the prediction is derived from historical match data that neither 𝑥 nor 𝑦 participated in. This implies that we can still predict, for example, the outcome of English Premier League matches based on training data from Japan, New Zealand, Mexico, South Africa, Russia, and other countries, in addition to data from the English Premier League.
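To make the cross-league idea concrete, here is a minimal Python sketch. It is not the Dolores model and does not use the pi-rating formula or a Bayesian network; it assumes a deliberately simple error-correcting rating update and a frequency table in place of both. The function names, the learning rate `k`, the bucket `width` and the smoothing `prior` are all hypothetical choices made for illustration. What it shows is the general principle: each team's rating comes from its own results, while the mapping from rating difference to outcome probabilities is learned from matches pooled across every league.

```python
from collections import defaultdict

OUTCOMES = "HDA"  # home win, draw, away win


def update_ratings(ratings, home, away, home_goals, away_goals, k=0.1):
    """Nudge both ratings towards the observed goal difference.

    Illustrative update only (an assumption), not the pi-rating formula.
    """
    expected_diff = ratings[home] - ratings[away]
    error = (home_goals - away_goals) - expected_diff
    ratings[home] += k * error
    ratings[away] -= k * error


def bucket(rating_diff, width=0.5):
    """Discretise a rating difference into an integer bucket."""
    return round(rating_diff / width)


def fit_pooled_table(matches, k=0.1, width=0.5):
    """Process matches from any mix of leagues in chronological order.

    Returns the team ratings and a table counting outcomes per bucket of
    pre-match rating difference, pooled over all leagues.
    """
    ratings = defaultdict(float)
    counts = defaultdict(lambda: {o: 0 for o in OUTCOMES})
    for home, away, hg, ag in matches:
        diff = ratings[home] - ratings[away]  # difference before the match
        outcome = "H" if hg > ag else "A" if hg < ag else "D"
        counts[bucket(diff, width)][outcome] += 1
        update_ratings(ratings, home, away, hg, ag, k)
    return ratings, counts


def predict(counts, rating_diff, width=0.5, prior=1.0):
    """Smoothed outcome probabilities for a given rating difference."""
    c = counts.get(bucket(rating_diff, width), {o: 0 for o in OUTCOMES})
    total = sum(c.values()) + len(OUTCOMES) * prior
    return {o: (c[o] + prior) / total for o in OUTCOMES}


# Toy data: (home, away, home_goals, away_goals); teams are made up.
matches = [("A", "B", 2, 0), ("B", "A", 0, 1), ("C", "D", 1, 1)]
ratings, counts = fit_pooled_table(matches)
probs = predict(counts, ratings["C"] - ratings["D"])
print(probs)
```

Because the table is indexed by rating difference rather than by team, a prediction for two teams can draw on matches that neither of them played in, which is the property the paragraph above describes.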
The Machine Learning journal has published the descriptions of the highest ranked systems in its latest issue published online today. The full reference for Anthony's paper is:
Constantinou, A. (2018). Dolores: A model that predicts football match outcomes from all over the world. Machine Learning, 1-27, DOI: https://doi.org/10.1007/s10994-018-5703-7
The full published version can be viewed (for free) at https://rdcu.be/Nntp. An open access pre-publication version (pdf format) is available for download here.
This work was partly supported by the European Research Council (ERC), research project ERC-2013-AdG339182-BAYES_KNOWLEDGE.
The DOLORES Hybrid Bayesian Network was built and run using the AgenaRisk software.
The full reference for the pi-ratings model (used by the competition's winning team) is:
Constantinou, A. C. & Fenton, N. E. (2013). Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries. Journal of Quantitative Analysis in Sports. Vol. 9, Iss. 1, 37–50. DOI: http://dx.doi.org/10.1515/jqas-2012-0036
Open access version here.
See also:
- Anthony's pi-football website.
- Explaining and predicting football results over an entire season
- Explaining Bayesian networks through a football management problem
- The problem with predicting football results
- A Bayesian network to determine optimal strategy for Spurs' success
- Proving referee bias with Bayesian networks