This year's Premier League has been dubbed "the most unpredictable ever". With the final round of games taking place tomorrow, one of the country's most distinguished statisticians outlines his method of predicting results.
Predicting the future is difficult. People generally use their experience, judgement and gut feelings, but sports-betting companies use subtle statistical models to come up with reasonable odds on match results.
It's nearly the end of the football season and there are buckets of data from 370 matches, so can we predict what will happen on Sunday using fairly basic mathematics? Well, we've had a go, using the standard tables of goals for and against when playing home and away.
The basic idea is fairly simple. Take the Arsenal - Fulham game: we first try to work out how many goals we might expect Arsenal to score. On average in a Premier League game the home team scores 1.7 goals. We then move this figure up or down depending on Arsenal's attacking performance this season (good) and Fulham's defending (not so good), and we end up expecting Arsenal to score 2.2 goals.
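As a rough sketch of that adjustment, suppose the league average is simply multiplied by two factors: one for how strong the attack is and one for how leaky the opposing defence is, each measured relative to an average team. The article does not say the adjustment is multiplicative, so that is an assumption, and the factor values below are invented purely so that the arithmetic lands near 2.2. They are not Arsenal's or Fulham's real figures.

```python
def expected_goals(league_average, attack_factor, defence_factor):
    """Scale the league-average goals by two relative-strength factors.

    A factor above 1 means a better-than-average attack, or a
    worse-than-average defence for the opponent. The multiplicative
    form is an assumption, not something stated in the article.
    """
    return league_average * attack_factor * defence_factor


HOME_AVERAGE = 1.7  # average home goals per game, as quoted in the article

# Hypothetical factors, chosen only for illustration.
arsenal_attack_at_home = 1.15
fulham_defence_away = 1.125

arsenal_expected = expected_goals(
    HOME_AVERAGE, arsenal_attack_at_home, fulham_defence_away
)
print(f"Expected Arsenal goals: {arsenal_expected:.1f}")
```

The same function would serve for the away side, starting from the away average instead of the home one.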
Now even such a fine team can't score 0.2 of a goal, but fortunately we can use a classic result in probability theory known as the Poisson distribution (named after the great Professor Poisson) to tell us how to assess the probability of scoring, for example, just 1 goal, which comes out at about 24% for an expected 2.2 goals.
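For readers who want to see the formula at work, here is a minimal sketch of the Poisson calculation. If a team is expected to score some average number of goals, the chance of exactly k goals is e to the minus that average, times the average to the power k, divided by k factorial. The function name is our own, and the mean of 2.2 is the rounded figure quoted above, so the printed percentages are approximate.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals when `mean` are expected."""
    return math.exp(-mean) * mean**goals / math.factorial(goals)


# Chance of Arsenal scoring 0, 1, 2, ... goals, given an expected 2.2.
for goals in range(7):
    print(f"{goals} goals: {poisson_pmf(goals, 2.2):.1%}")
```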
Then we do the same for Fulham: an away team can expect to score 1.1 goals on average, but by the time we have allowed for Fulham's attack strength when playing away (not good) and Arsenal's defence at home (good) we expect Fulham to only score 0.6 of a goal.
Cool calculation
Again, Professor Poisson comes to the rescue and enables us to put probabilities on all possible goal combinations, say a 2-0 Arsenal win (15%) or a 0-1 Fulham win (roughly 3.5%). 2-0 turns out to be the most likely score, and adding up all possible goal combinations gives a 73% chance for a home win, 18% for a draw and 9% for an away win.
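The adding-up can be sketched in a few lines. This version assumes the two teams' goal counts are independent, so the chance of any scoreline is just the two Poisson probabilities multiplied together. That independence is our simplifying assumption, as are the function names and the cut-off of ten goals per side.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals when `mean` are expected."""
    return math.exp(-mean) * mean**goals / math.factorial(goals)


def outcome_probabilities(home_mean, away_mean, max_goals=10):
    """Sum scoreline probabilities into home win, draw and away win.

    Assumes the two teams score independently and ignores scorelines
    with more than `max_goals` for either side (a negligible chance).
    """
    home_win = draw = away_win = 0.0
    best_score, best_prob = None, 0.0
    for home in range(max_goals + 1):
        for away in range(max_goals + 1):
            prob = poisson_pmf(home, home_mean) * poisson_pmf(away, away_mean)
            if home > away:
                home_win += prob
            elif home == away:
                draw += prob
            else:
                away_win += prob
            if prob > best_prob:
                best_score, best_prob = (home, away), prob
    return home_win, draw, away_win, best_score


# Rounded expected goals from the article: Arsenal 2.2, Fulham 0.6.
home_win, draw, away_win, likeliest = outcome_probabilities(2.2, 0.6)
print(f"Home win {home_win:.0%}, draw {draw:.0%}, away win {away_win:.0%}")
print(f"Most likely score: {likeliest[0]}-{likeliest[1]}")
```

Because 2.2 and 0.6 are rounded, the result lands within a point or two of the split quoted above rather than matching it exactly.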
We know there is a strong home advantage: on average teams score 50% more goals at home than away. But a tricky issue is deciding whether to let a team's attack strength or defence weakness depend on whether they are playing at home or away.
For example, Bolton have let in a lot of goals at home this season, but when playing away their defence has been reasonable.
We've allowed for this when assessing their defence weakness for their home game on Sunday, but the analysis shows that this particular match could go any way and is very hard to predict.
Of course the model could get much more complex, allowing for recent performance and other factors. But one of the advantages of this rather cold analytic approach is that it avoids emotion - I don't support any team and so I don't let wishful thinking get in the way. Nor do I read any football pundits.
Last year we got nine out of ten final results correct, including two exact scores. But we know we were lucky. Since we base everything on probabilities, we can estimate that this year we only expect to get around six or seven of Sunday's results right, and one or two exact scores. Much more than this means we've been fortunate, but it could just as easily go the other way.
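The estimate of how many results we expect to get right follows from simple addition: if each prediction has its own probability of being correct, the expected number of correct predictions is the sum of those probabilities. The sketch below also gives a rough measure of how much luck is involved, assuming the matches are independent of one another. Only the first figure (0.73, for the Arsenal game) comes from the article; the other nine are placeholders invented for illustration.

```python
import math


def expected_correct(probabilities):
    """Expected number of correct predictions and its standard deviation.

    Each entry is the probability that one predicted result is right.
    The standard deviation assumes the matches are independent.
    """
    mean = sum(probabilities)
    spread = math.sqrt(sum(p * (1 - p) for p in probabilities))
    return mean, spread


# 0.73 is the Arsenal home-win probability from the article.
# The remaining nine values are hypothetical placeholders.
chances = [0.73, 0.55, 0.45, 0.65, 0.85, 0.70, 0.60, 0.80, 0.50, 0.45]

mean, spread = expected_correct(chances)
print(f"Expected correct results: {mean:.1f}, give or take {spread:.1f}")
```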
I am not going to bet anything on these matches, but it is still a gamble whether this all ends up in embarrassment.
The Prof's Predictions
Arsenal 2 Fulham 0
Aston Villa 1 Blackburn 0
Bolton 1 Birmingham 0
Burnley 0 Tottenham 2
Chelsea 4 Wigan 0
Everton 2 Portsmouth 0
Hull 0 Liverpool 1
Man U 2 Stoke 0
West Ham 1 Man C 2
Wolves 0 Sunderland 1
David Spiegelhalter is Winton Professor of the Public Understanding of Risk in the Statistical Laboratory at the University of Cambridge.