Wednesday, 7 August 2013

The problem with predicting football results - you cannot rely on the data


Bloomberg Sports have published their predictions for the forthcoming Premiership season (****see update below for actual results) in the form of the predicted end of season table. Here are some key snippets from their press release:
The table indicates that this season will be a three horse race between Chelsea, Manchester City and Manchester United .... The Bloomberg Sports forecast expects Arsenal to claim the final Champions League place ahead of North London rivals Tottenham Hotspur.... At the bottom of the table, all three newly promoted teams are expected to face the drop...
There is just one problem with this set of 'predictions'. The final table - with very minor adjustments - essentially replicates last season's final positions. The top seven remain the same (the only positional changes being that Chelsea and Man Utd swap positions 1 and 3, and Liverpool and Everton swap positions 6 and 7). And the bottom three are the three promoted teams, so they also 'retain' their positions.

Bloomberg say they are using "mathematically-derived predictions" based on "vast amounts of objective data". But herein lies the problem. As we argue in our book, relying on data alone is the classical statistical approach to this kind of prediction. And classical statistics is great at 'predicting the past'. The problem is that we actually want to predict the future, not the past!

Together with my PhD student Anthony Constantinou, I have been applying Bayesian networks and related methods to the problem of football prediction for a number of years. The great thing about Bayesian networks is that they enable you to combine the standard statistical data (most obviously historical and recent match results) with subjective factors. And it is the incorporation of the subjective (expert) factors that is the key to improved prediction, something that 'classical' statisticians just do not seem to get.
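To make the idea of combining data with expert judgement concrete, here is a toy sketch. It is not our Bayesian network model; it is a much simpler Dirichlet-multinomial update, and the function name, the prior, the prior weight and the match counts are all made up for illustration. The expert's opinion is treated as a number of 'pseudo-matches' that are blended with the observed results.

```python
def posterior_outcome_probs(expert_prior, prior_weight, observed_counts):
    """Blend an expert prior with observed match counts.

    expert_prior:    dict mapping outcome -> probability (should sum to 1)
    prior_weight:    how many 'pseudo-matches' the expert opinion is worth
    observed_counts: dict mapping outcome -> number of matches with that outcome

    Returns the posterior mean of a Dirichlet-multinomial model:
    (prior_weight * prior + count) / (prior_weight + total matches).
    """
    total = prior_weight + sum(observed_counts.values())
    return {
        outcome: (prior_weight * p + observed_counts.get(outcome, 0)) / total
        for outcome, p in expert_prior.items()
    }


# Hypothetical numbers: the data alone favours a home win, but the expert
# knows something the data does not (say, key home players are injured).
expert_prior = {"home": 0.3, "draw": 0.3, "away": 0.4}
observed_counts = {"home": 6, "draw": 2, "away": 2}

posterior = posterior_outcome_probs(expert_prior, prior_weight=10,
                                    observed_counts=observed_counts)
for outcome, prob in posterior.items():
    print(f"{outcome}: {prob:.2f}")
```

The larger the prior weight, the more the expert's view dominates the historical results. Setting it to zero gives back the purely data-driven frequencies.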
 
This combination of data and expert judgement has enabled us to get more accurate predictions than any other published system and has even enabled us to 'beat the bookies' consistently (based on a simple betting strategy) despite the bookies' built-in profit margin. Unlike Bloomberg (and others) we have made our methods, models and results very public (a list of published papers in scholarly journals is below). In fact, for the last two years Anthony has posted the predictions for all matches the day before they take place on his website pi-football. The prediction for each match is summarised as a very simple set of probabilities, namely the probability of a home win, draw and away win. Good betting opportunities occur when one of the probabilities is significantly higher than the equivalent probability from the bookies' odds.
Example: Suppose Liverpool are playing at home to Stoke. Because of the historical data the bookies would regard Liverpool as strong favourites. They would typically rate the chances of Stoke winning to be very low - say 10% (which in 'odds' terms equates to '9 to 1 against'). They add their 'mark-up' and publish odds of, say, 8 to 1 against a Stoke win (which in probability terms is 1/9 or about 11%). But suppose there are specific factors that lead our model to predict that the probability of a Stoke win is 20%. Then the model is saying that the bookmakers' odds - even given their mark-up - have significantly underestimated the probability of a Stoke win. Although our model still only gives Stoke a 20% chance of winning, it is worth placing a bet. Imagine 10 match scenarios like this. If our predictions are correct then on average you will win on 2 of the 10 occasions. Assuming you bet £1 each time, you will end up spending £10 and getting £18 back (each winning bet at 8 to 1 returns £8 in winnings plus the £1 stake) - a very healthy 80% return on the £10 staked.
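The arithmetic in the Liverpool v Stoke example can be checked with a short sketch. The function names are hypothetical and this is only the back-of-envelope calculation from the example, not the betting strategy used in our papers. Odds of 'x to 1 against' correspond to a probability of 1/(1 + x), and the expected profit on a £1 bet is the model probability times x, minus the probability of losing the stake.

```python
from fractions import Fraction


def implied_probability(odds_against):
    """Probability implied by odds of 'odds_against to 1 against'."""
    return 1 / (1 + Fraction(odds_against))


def expected_profit_per_pound(model_prob, odds_against):
    """Expected profit on a 1 pound stake if the model probability is right.

    With probability model_prob we win `odds_against` pounds;
    otherwise we lose the 1 pound stake.
    """
    return model_prob * odds_against - (1 - model_prob)


fair_odds = 9        # bookies' own view: 9 to 1 against a Stoke win
published_odds = 8   # odds actually offered after the mark-up
model_prob = Fraction(1, 5)  # our model's 20% for a Stoke win

print(implied_probability(fair_odds))       # 1/10
print(implied_probability(published_odds))  # 1/9

# Ten 1 pound bets, of which the model expects two to win.
stake = 10
returned = 2 * (published_odds + 1)  # each win pays 8 pounds plus the stake
print(returned, returned - stake)    # 18 8

print(expected_profit_per_pound(model_prob, published_odds))  # 4/5
```

A bet is only worth considering when the model probability exceeds the probability implied by the published odds. Here 1/5 is well above 1/9, which is where the 80% expected return comes from.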
Thanks to Alex on the Spurs-list for the tip-off on the Bloomberg report.

****Update: The actual results for the 2013-14 season were very different from the Bloomberg predictions. The title was a two-horse race between Man City and Liverpool with the rest far behind. Liverpool had been predicted to come 6th and would have won the title but for a late collapse. Man Utd finished 7th. Only one of the newly promoted clubs (Cardiff) was relegated.

References:
  • Constantinou, A., N. E. Fenton and M. Neil (2013). "Profiting from an Inefficient Association Football Gambling Market: Prediction, Risk and Uncertainty Using Bayesian Networks." Knowledge-Based Systems. http://dx.doi.org/10.1016/j.knosys.2013.05.008
  • Constantinou, A. C. and N. E. Fenton (2013). "Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries." Journal of Quantitative Analysis in Sports 9(1): 37-50. http://dx.doi.org/10.1515/jqas-2012-0036
  • Constantinou, A., N. E. Fenton and M. Neil (2012). "pi-football: A Bayesian network model for forecasting Association Football match outcomes." Knowledge-Based Systems 36: 322-339. http://dx.doi.org/10.1016/j.knosys.2012.07.008
  • Constantinou, A. and N. E. Fenton (2012). "Solving the problem of inadequate scoring rules for assessing probabilistic football forecasting models." Journal of Quantitative Analysis in Sports 8(1): Article 1. http://dx.doi.org/10.1515/1559-0410.1418

8 comments:

  1. Predications are the base of sports betting and gambling market . Lot's of people play gambling games and betting to earn more money . It also became easy because online sports betting is safe and secure . Here it described well about the bookies and historical facts .

  2. Hi there,
    I ran into your document on the internet, I couldn't understand much of it but what caught my interest is your believability that it is possible to predict football match results....
    I will like to know if you have  little time for me to share what I have with you...
    There is another way to predict football draws but there is no way to document it because academia isn't my strong forte....

    I would like to know if you are conversant with R or Weka or Rapid Miner?

    My method of prediction is crude and I would like to know if you and I can brainstorm on how to engage machine learning techniques to increase chances of predicting the right football match result....
    If this discussion interests you, please get back to me.

    Iamshuga@gmail.com

  3. Enormous blog you individuals have made there, I entirely appreciate the work. payday lenders in new york city

  4. Free Casino Games : Play Free Online Casino Games. ..with http://www.playdoit.com

  5. You can check your math prediction formula on my new website about football/soccer prediction Math: http://www.footballlig.com
    This site is trying to deal with pure math and Statistical data analyze where you set multiples and get your prediction using real time data
    It’s not commercial and if you want to help me develop this web site you are welcome

  6. The given information is realy very helpful. thanks a lot for such a nice information. Thanks sport betting malaysia

  7. Thanks for this post! With betting, you are always betting against the book, not the game/team. What do you think about using predictive modeling to rapidly test various algorithms?

  8. Seriously Excellent explanation! In the online soccer world, people just mesmerized to bet through Today's soccer predictions site. If you too want to take the pleasure then get the live soccer betting experience with Italy Serie B Fixtures and Predictions. Grab the 50% sign up bonus too.
