Friday 4 May 2018

Anthony Constantinou's football prediction system wins second spot in international competition


Anthony Constantinou
QMUL lecturer Dr Anthony Constantinou of the RIM research group has come second in an international competition to produce the most accurate football prediction system. Moreover, the winners (whose predictive accuracy was only very marginally better) actually based their model on the previously published pi-ratings system of Constantinou and Fenton.





Anthony's model Dolores was developed for the International Machine Learning for Soccer Competition hosted by the Machine Learning journal.

All participants were provided with the results of matches from 52 different leagues around the world - with some missing data as part of the challenge. They had to produce a single model before the end of March 2017 that would be tested on its accuracy of predicting 206 future match outcomes from 26 different leagues, played from March 31 to April 9 in 2017.

Dolores was ranked 2nd, with a predictive accuracy almost the same as that of the top-ranked system: the difference in error rate between the two was less than 1%, and Dolores's error rate was nearly 120% lower than that of the lowest-ranked participants among those that passed the basic criteria.
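To make the idea of an "error rate" for match predictions more concrete, here is a minimal sketch of one commonly used measure for three-outcome football forecasts, the ranked probability score (RPS). This post does not say which measure the competition used, so treating RPS as the error measure is an assumption here, and the function names and example probabilities below are purely illustrative (they are not taken from Dolores or from the competition data).

```python
from typing import Sequence


def ranked_probability_score(probs: Sequence[float], outcome: int) -> float:
    """RPS for one match (lower is better).

    probs   -- forecast probabilities in a fixed order, e.g. (home win, draw, away win)
    outcome -- index of the result that actually occurred
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("forecast probabilities must sum to 1")
    if not 0 <= outcome < len(probs):
        raise ValueError("outcome index out of range")

    cum_forecast = 0.0  # cumulative forecast probability
    cum_observed = 0.0  # cumulative observed outcome (0 until the result, then 1)
    total = 0.0
    # The last category is skipped because both cumulative sums equal 1 there.
    for i in range(len(probs) - 1):
        cum_forecast += probs[i]
        cum_observed += 1.0 if i == outcome else 0.0
        total += (cum_forecast - cum_observed) ** 2
    return total / (len(probs) - 1)


def mean_rps(forecasts: Sequence[Sequence[float]], outcomes: Sequence[int]) -> float:
    """Average RPS over a set of matches (illustrative helper)."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need one outcome per forecast, and at least one match")
    scores = [ranked_probability_score(p, o) for p, o in zip(forecasts, outcomes)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical forecasts for two matches; both ended in a home win (index 0).
    example_forecasts = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5)]
    example_outcomes = [0, 0]
    print(mean_rps(example_forecasts, example_outcomes))
```

Under this measure a perfect forecast scores 0 and the worst possible forecast scores 1, so two systems can be compared by averaging their scores over the same set of test matches.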

Dolores is designed to predict football match outcomes in one country by observing football matches in multiple other countries. It is based on a) dynamic ratings and b) Hybrid Bayesian Networks.

Unlike past academic literature, which tends to focus on a single league or tournament, Dolores provides empirical proof that a model can make a good prediction for a match outcome between teams 𝑥 and 𝑦 even when the prediction is derived from historical match data that neither 𝑥 nor 𝑦 participated in. This implies that we can still predict, for example, the outcome of English Premier League matches based on training data from Japan, New Zealand, Mexico, South Africa, Russia, and other countries, in addition to data from the English Premier League.

The Machine Learning journal has published the descriptions of the highest ranked systems in its latest issue published online today. The full reference for Anthony's paper is:

Constantinou, A. (2018). Dolores: A model that predicts football match outcomes from all over the world. Machine Learning, 1-27, DOI: https://doi.org/10.1007/s10994-018-5703-7

The full published version can be viewed (for free) at https://rdcu.be/Nntp. An open access pre-publication version (pdf format) is available for download here.

This work was partly supported by the European Research Council (ERC), research project ERC-2013-AdG339182-BAYES_KNOWLEDGE.
The DOLORES Hybrid Bayesian Network was built and run using the AgenaRisk software.

The full reference for the pi-ratings model (used by the competition's winning team) is:
Constantinou, A. C. & Fenton, N. E. (2013). Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries. Journal of Quantitative Analysis in Sports. Vol. 9, Iss. 1, 37–50. DOI: http://dx.doi.org/10.1515/jqas-2012-0036
Open access version here.

1 comment:

  1. As part of the challenge, all competitors received match results from 52 different leagues throughout the world, some of which had incomplete information. Before the end of March 2017, they had to create a single model that would be evaluated on its accuracy in forecasting the results of 206 future matches from 26 different leagues, to be played from March 31 to April 9, 2017.
