How Opta got this season so totally and utterly wrong: the statistics

By Tony Attwood

Now we all make mistakes, and there’s not much point in pointing out the errors of those who have used a pen and paper, with maybe a calculator, to work out what will happen at the end of the season.  (That’s what we did, and we predicted Arsenal in third place).

But there is plenty of good reason to have a bit of fun at the expense of those who claim they have used a supercomputer to work out the results, because they haven’t.  As we have often noted, supercomputers are far too expensive and their time is in far too much demand.  The Titan machine, for example, cost $97 million.

So I thought it might be interesting to see just what an alleged “supercomputer” predicted at the start of the season, and quote Opta’s exact words on Twitter, which are that the supercomputer “can’t see past another close battle between Liverpool and Manchester City in 2022-23, with Jürgen Klopp’s Reds edging out Pep Guardiola’s side for the title.”

Well, looking at the table at the moment, it does seem to me one can see past Liverpool and Manchester City, what with Arsenal being 19 points above Liverpool, having scored 13 goals more and conceded five fewer (a goal difference 18 better).

 

Pos  Team               P   W   D   L   F   A   GD  Pts
1    Arsenal           23  17   3   3  51  23   28   54
2    Manchester City   24  16   4   4  60  24   36   52
8    Liverpool         22  10   5   7  38  28   10   35
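
As a quick check on the gaps between Arsenal and Liverpool, here is a minimal Python sketch that recomputes them from the two rows in the table above. The dictionary layout and the `gap` helper are purely illustrative choices of mine; only the numbers come from the table. On these figures Arsenal have conceded five fewer than Liverpool, and 18 is the difference between the two goal differences.

```python
# Rows copied from the table above: played, won, drawn, lost,
# goals for, goals against, goal difference, points.
TABLE = {
    "Arsenal":   {"P": 23, "W": 17, "D": 3, "L": 3, "F": 51, "A": 23, "GD": 28, "Pts": 54},
    "Liverpool": {"P": 22, "W": 10, "D": 5, "L": 7, "F": 38, "A": 28, "GD": 10, "Pts": 35},
}


def gap(team_a, team_b, column):
    """Return team_a's value minus team_b's value for one table column."""
    return TABLE[team_a][column] - TABLE[team_b][column]


points_gap = gap("Arsenal", "Liverpool", "Pts")    # 54 - 35 = 19
scored_gap = gap("Arsenal", "Liverpool", "F")      # 51 - 38 = 13
conceded_gap = gap("Liverpool", "Arsenal", "A")    # 28 - 23 = 5 fewer conceded by Arsenal
goal_diff_gap = gap("Arsenal", "Liverpool", "GD")  # 28 - 10 = 18

print(f"Points gap: {points_gap}")
print(f"Goals scored gap: {scored_gap}")
print(f"Fewer goals conceded by Arsenal: {conceded_gap}")
print(f"Goal difference gap: {goal_diff_gap}")
```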

 

Arsenal’s chances of the title were given as 0.13% (ie just over one tenth of one percent) of those of Liverpool.

Now this is worrying since supercomputers are used to defend the nation from attacks by enemies known and unknown.  If they can get this so wrong, we may ask, are we safe?

According to the mythical “supercomputer” (or “my mate Dave with a pencil and a calculator whose batteries are running flat”, to give it a more accurate description), the four clubs most likely to secure Champions League football were Manchester City, Liverpool, Tottenham and Chelsea.

And they went on “The three outside bets for a top four finish based on the supercomputer’s analysis are Newcastle United, West Ham United and Leicester City.”

So to put that in perspective, Arsenal would not only fail to get into the top four, but would finish lower than West Ham and Leicester City.   Again, let’s have a quick look.

 

Pos  Team               P   W   D   L   F   A   GD  Pts
1    Arsenal           23  17   3   3  51  23   28   54
14   Leicester City    23   7   3  13  36  41   -5   24
18   West Ham United   23   5   5  13  19  29  -10   20

 

As for the foot of the table, Opta’s mythical supercomputer analysed who was going down at the end of the season.   They went for Bournemouth having “the highest chance of relegation (45.0%), just ahead of Nottingham Forest (44.5%) and last season’s Championship title-winners Fulham (43.8%).”

And then they added, “History suggests it might be possible for one of those sides to stay up, however.”

So let’s look down the table

 

Pos  Team               P   W   D   L   F   A   GD  Pts
6    Fulham            24  11   5   8  35  30    5   38
13   Nottingham Forest 23   6   7  10  18  38  -20   25
17   AFC Bournemouth   23   5   6  12  21  44  -23   21
18   West Ham United   23   5   5  13  19  29  -10   20
19   Leeds United      23   4   7  12  28  39  -11   19
20   Southampton       23   5   3  15  19  40  -21   18

 

Not one of the Opta supercomputer relegation predictions looks like being certain, although Bournemouth are only just above relegation at the moment.  But as for the notion that fans of Brentford “should be nervous of the campaign ahead” the only reply can be, Opta’s alleged supercomputer needs some new medication.

Indeed, if you are given some sort of prediction for a football outcome and it mentions Opta it might be best not to take any notice.  If you want to look at our original preview of this season in which we suggested third was well within the club’s reach – as opposed to sixth suggested by Opta – it is here. 

Here’s Opta’s prediction for the top, with a note as to how far off they are at the moment. Some of the bigger cock-ups on the prediction front are in bold.

  1. Liverpool – currently 8th, 19 points behind Arsenal 
  2. Manchester City – currently 2nd, two points behind Arsenal
  3. Tottenham Hotspur – currently 4th, 12 points behind Arsenal 
  4. Chelsea – currently 10th – 23 points behind Arsenal
  5. Manchester United – currently 3rd, five points behind Arsenal 
  6. Arsenal – currently top 
  7. West Ham United – currently 18th, 15 points behind their predicted position
  8. Newcastle United – currently 5th, 13 points behind Arsenal
  9. Leicester City – currently 14th, 11 points behind their predicted position
  10. Aston Villa – currently 11th – Opta almost getting this right
  11. Brighton & Hove Albion – currently 7th, seven points above Opta’s predicted place
  12. Crystal Palace – currently 12th.   OPTA PREDICTION CURRENTLY CORRECT
  13. Wolverhampton Wanderers – currently 15th
  14. Everton – currently 16th
  15. Leeds United – currently 19th 
  16. Brentford – currently 9th – 14 points above their predicted position
  17. Southampton – currently 20th
  18. Fulham – currently 6th, 18 points above their predicted position
  19. Nottingham Forest – currently 13th 
  20. Bournemouth – currently 17th

So just two predictions are right at the moment (Manchester City in second and Crystal Palace in 12th), and in one place they are 23 points out.
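
To put a number on how far out the list is, here is a minimal Python sketch that compares each predicted position with the current position given in the list above. The function name and the choice of "places out" as the measure are my own assumptions for illustration; nothing here is Opta's method. Taking the list exactly as written, two rows match (Manchester City and Crystal Palace), the average miss is a little over four places, and Fulham are the furthest out.

```python
# Teams in Opta's predicted order (1st to 20th), each paired with the
# current league position quoted in the list above.
PREDICTED_ORDER = [
    ("Liverpool", 8), ("Manchester City", 2), ("Tottenham Hotspur", 4),
    ("Chelsea", 10), ("Manchester United", 3), ("Arsenal", 1),
    ("West Ham United", 18), ("Newcastle United", 5), ("Leicester City", 14),
    ("Aston Villa", 11), ("Brighton & Hove Albion", 7), ("Crystal Palace", 12),
    ("Wolverhampton Wanderers", 15), ("Everton", 16), ("Leeds United", 19),
    ("Brentford", 9), ("Southampton", 20), ("Fulham", 6),
    ("Nottingham Forest", 13), ("Bournemouth", 17),
]


def position_errors(rows):
    """Return (team, predicted, current, places_out) for every team."""
    return [
        (team, predicted, current, abs(predicted - current))
        for predicted, (team, current) in enumerate(rows, start=1)
    ]


errors = position_errors(PREDICTED_ORDER)

# Teams whose predicted place equals their current place.
exact_hits = [team for team, _, _, places_out in errors if places_out == 0]

# Average number of places by which the prediction is out.
mean_abs_error = sum(places_out for *_, places_out in errors) / len(errors)

# The single biggest miss, measured in league places.
worst = max(errors, key=lambda row: row[3])

print("Exactly right:", exact_hits)
print("Average places out:", mean_abs_error)
print("Biggest miss:", worst[0], "by", worst[3], "places")
```

Places out is a cruder measure than points, since a gap of three places can mean one point or ten, but it needs nothing beyond the list itself.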

And remember all this is supposedly using Opta’s supercomputer.  I think the best thing to do with Opta statistics is look at them with a certain amount of caution (if not derision).


14 Replies to “How Opta got this season so totally and utterly wrong: the statistics”

  1. Tony

    First thing I would ask is, is the OPTA supercomputer prediction the same one picked up and published by all the mainstream media at the beginning of the season, as was published in the SUN? I hope so, as I’m going to use the one I found from the Sun for my comparisons.

    https://www.thesun.co.uk/sport/19376214/premier-league-supercomputer-man-utd/

    If so, by and large I understand it, or at least most of it.

    Again, as we all know, all computers do is analyse raw data. They don’t do emotion, conjecture, or subjectivity. It’s all about the raw data. And what is their primary source of raw data? Last season’s table. How the teams performed last season. And that’s it basically.

    As such it comes as absolutely no surprise to me that the ‘supercomputer’s’ prediction for 2022/23 is very similar to the way they finished last season. In fact the top 6 is the exact same 6 teams in the exact same order. Not original but it’s how computers work. Essentially what happened last time is likely to be what happens this time.

    That’s how computers work.

    The relegation prediction is also a simple assumption based on past seasons, namely those that come up tend to struggle, hence a rudimentary prediction of relegation for all. Let’s be honest, without any previous form against the rest of the teams to judge them on it’s a fair assumption. In fact, as far as a data-driven computer analysis goes, it’s a prediction it is bound to make.

    And just look at this for more proof of this season to season transposition that the computer can’t avoid doing for positions 15, 16 and 17.

    Last season it finished thus:

    15th Southampton
    16th Everton
    17th Leeds

    Computer prediction:

    15th Everton
    16th Leeds
    17th Southampton

    Current positions:

    16th Everton
    19th Leeds
    20th Southampton

    That’s pretty accurate in all honesty. All that’s happened is that the promoted teams are doing better than expected, but what was expected was what most people, let alone computers, would expect.

    The question is really, should computers be better than that?

    For example, should the computer have seen the massive improvement in Arsenal’s form over the second two thirds of the season?

    Should it have taken into account the injuries and COVID issues that contributed to our poor start to the season?

    The bedding in of new players?

    The injuries that contributed to our poor run in?

    Does it consider how dependent Liverpool are on Van Dijk and what may happen if he gets injured?

    Should a computer be able to use all these nuances that we know about? Nuances we know about because it’s OUR club?

    I’m not sure.

    All a computer does is look at the raw data from last season, basically the final table, and predict how it’s going to finish this season. No more, no less.

    Which is why it’s a load of b****cks

  2. @Nitram,

    considering that it is proven that AI is sexist and racist – because it has been programmed mainly by young celibate white males… my guess is that the guys programming the AI of the supercomputer to predict are fans of some specific teams…

    Anyway, we’re used to that kind of crap and it is fun to see all of them being shown utterly wrong.

    All the while I’m enjoying the games….

  3. Chris

    The interesting part for me is when people start to ‘analyse’ what the computer has supposedly predicted, because that’s where the subjectivity comes in. Did I say subjectivity? I meant stupidity. Such as:

    “…..Opta’s exact words on Twitter, which are that the supercomputer “can’t see past another close battle between Liverpool and Manchester City in 2022-23, with Jürgen Klopp’s Reds edging out Pep Guardiola’s side for the title.”

    Okay so far, subjectively, given everything we saw last season that is a fair subjective, as well as data driven point of view. But it’s after that, the predictions for the rest of the top 4 and beyond, that it becomes more enlightening as to what people are thinking, or rather hoping

    For example:

    “And they (Opta) went on “The three outside bets for a top four finish based on the supercomputer’s analysis are Newcastle United, West Ham United and Leicester City.”

    How, under any amount of logic can they ‘base’ that conclusion on the ‘Supercomputers’ analysis?

    The supercomputer predicted Leicester in 9th, West Ham in 10th and Newcastle in 13th. How does that equate to them being the best ‘outside’ bets for the top 4?

    Surely, Arsenal and Man Utd were the best outside bets for a top 4 finish ‘based’ on what the computer said?

    That is more like some fool at Opta’s wish list, rather than anything based in reason or logic, computer generated or otherwise.

    Yes the computer is out, well out in certain cases, but as I say, basing its analysis on last season, which is all a computer will do, its predictions are, well, predictable!

    As I say the ridiculous part is when these ‘humans’ start to give us their ‘opinions’.

    The fact they overlooked Arsenal when there was so much data to see, all collated and reproduced here on Untold many many times, is a particular concern, especially when it’s actually their job to see these sorts of things.

    Untold predicted 3rd. I predicted top 2 as a possibility.

    How is it that a company whose job it is to see these things could have its ‘analysts’ predict Arsenal finishing not only out of the top 4 but possibly as low as 8th, when everything in the data, if looked at properly, told us something completely different?

    Could it be it was more to do with what they wanted to see rather than what they actually could see?

    Either way it’s a poor show on Opta’s part.

  4. @Nitram,

    combine bias and just re-reading the past season and you’ve got your result.
    Worthless and just done to get clicks. Guess it pays or they’d stop doing it.
    So let’s use them for stats of past games and forget the rest.

  5. fascinating to see the guardian all worked up about potential corruption of spanish referees but incapable of considering just basic human incompetence in PL refereeing….

  6. There’s a whole lot of ‘Slamming’ going on.

    I’m so glad we’re doing so well otherwise every man and his dog would be having a go at us. Oh, wait a minute, they still are.

    The Mirror are relentless in their anti Arsenal reporting, but they are not the only ones. Talkshite are still the same talkshite I stopped listening to years ago, and most other ‘outlets’ never give it a rest either.

    The Mirror 4/1/23

    Alan Shearer slams Mikel Arteta after Arsenal draw

    The Mirror 22/1/23

    Gary Neville slams Mikel Arteta as he weighs in on debate

    The Mirror 20/2/23

    Mikel Arteta slammed for “taking the p***” after Arsenal boss caught mocking referee

    The Mirror just about every day

    Mikel Arteta has been branded “deluded and disrespectful” by talkSPORT presenter Natalie Sawyer.

    Arteta’s not the only one getting it in the neck:

    The Mirror 18/2/23

    Gabby Agbonlahor tears into “rubbish” Gabriel Martinelli for being ‘disrespectful’

    And even winning with 2 late goals doesn’t prevent the entire team from being ‘slammed’

    “They’re going to pay a heavy price” – Tony Cascarino slammed Arsenal for their performance in their 4-2 win against Aston Villa in the Premier League on Saturday (February 18)”.

    And this is from the bunch of t0$$ers who had us finishing down in 5th, 6th or worse.

  7. @Nitram,

    well I’ve not yet seen these deadwood football players slamming Guardiola who, wait, right… is doing worse than Arteta…

  8. Those dreaded supercomputers are at it again.

    This time, rather than look into the whys and the wherefores of what has been predicted I thought I’d try to delve a little into the World of Supercomputers and get an idea if they do actually get used for this sort of thing, and it has to be said, from what I have found they do.

    Apparently there are around 500 supercomputers in the World, mostly in America, India, China etc, as you would expect. And mostly they are used in the military and financial sectors, again as you would expect.

    But it would seem there are pretty powerful computers used in the gambling industry as well. The one that tends to be cited most often, for the World cup and leagues around the World, is a computer owned by MyBettingSites, which is the one that has been cited in the latest ‘supercomputer’ predictions doing the rounds.

    Now whether this computer is anything like as powerful as the military and financial ones is another matter, but it certainly seems to be classed as a ‘supercomputer’.

    Either way it is still garbage at it and usually tells us no more than my 10 year old niece could. In fact, the day it starts saying Arsenal could win the Premier League is the day I start to worry.

  9. BBC in particular has re-ignited its long-term love affair with Man Utd. Apparently they are now hard on the heels of local rivals Man City in the race to be EPL champions, ready to take City’s crown.

    ETH is of course the most outstanding manager in the world and MR the best player.

  10. Computers, whether super or not share one characteristic – “Garbage in, garbage out”.

    I used to configure/assemble/maintain supercomputers

  11. seismic

    Maybe you can tell us. When does a computer become classified as a ‘super’ computer and not just a f**k off big computer??

  12. The performance of a supercomputer is measured in floating-point operations per second (FLOPS), and today’s fastest supercomputers operate at over 100 quadrillion FLOPS (100 petaFLOPS). A regular computer has performance up to tens of teraFLOPS.

  13. Tony

    Thanks for that.

    None the wiser really. Safe to say, there are a lot of them around, and it does seem one, a betting company one, is used to predict certain outcomes in the World of football. It is also safe to say they are not great at it, unless what happened last year happens again this year, which seems to be the limit of their ability.

    As seismic says, garbage in garbage out.

    In truth it’s probably not the garbage they put in that is the biggest problem, it’s more likely what they DON’T put in.

    Just taking Arsenal in isolation, as I said earlier, does anyone think for one moment the computer was inputted with info about our early season COVID issues? New players? New tactics? That our youngsters would be a year older, wiser, better? That our late season collapse was partly down to these youngsters’ inexperience, aligned to a lot of injuries? That our new signings would settle so well, so quickly and be so good?

    There isn’t a hope in hell any of that would have been loaded into the computer.

    Then multiply all those nuances by 20 for each team.

    That is why it’s bulls**t

  14. The minimum definition of a supercomputer seems to change every year with the ongoing rapid improvements and developments in hardware. The last one I installed (over 20 years ago) consisted of 16 full-height racks, each containing 32 computers, which each contained 2 CPUs, networking, memory and hard disks. This had a total of 1026 CPUs, and was probably in the world’s top 100 at the time of installation, but would be an antique curiosity now.

    Since 2000, the size of the individual computers (blades) has decreased, the number of CPUs per blade may have increased, and the number of cores per CPU has risen from 1 to 64. Networking speeds have increased, and storage speeds have increased dramatically with the introduction of SSDs.

    I should think that there are currently over 20,000 working supercomputers around the world, and I’m not even certain that the top 500 list is definitive. I seem to remember that some organisations asked to be excluded from the list.

    The fastest supercomputer today is probably able to complete a given mathematical task 1 million times faster than the machine I installed.

    Some of these machines are used for cloud-computing, and customers are able to lease time on a sub-partition of one of these machines. They may want to use 16 cores to do some number-crunching for an hour or two. When these newspapers talk about supercomputers, what they almost certainly mean is that they have leased some time on a “cloud” machine. The important thing to remember here is that a computer only does what it is told to, and if the algorithms used are faulty (which I think is highly likely), or the quality of the data supplied to the algorithms is poor (which I also think is highly likely), then the results will be poor too. This is important because these media people never question what they are seeing in real life on a daily basis. Why would they ever question the quality of their data or the algorithms used to process that data?

    I’m still not sure why they would need a supercomputer to make their predictions. You can buy very powerful desktop systems on the internet. Mine is getting on now but will theoretically run to almost 200GFLOPS (20% of the speed of that 16-rack system I installed over 20 years ago).

    A supercomputer is defined as any computer that won’t fit into a double-garage.
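
The figures quoted in the last comment can be cross-checked against each other with a little arithmetic. The Python sketch below is only a rough check under stated assumptions: it takes the commenter's round numbers at face value (16 racks of 32 computers with 2 CPUs each, a desktop at about 200 GFLOPS being 20% of the old system, and a modern machine being a million times faster), and the variable names are mine. The rack arithmetic comes to 1,024 CPUs, two fewer than the 1,026 quoted, and the comment does not say where the other two come from.

```python
# Figures taken from the comment above; all are the commenter's round numbers.
RACKS = 16
COMPUTERS_PER_RACK = 32
CPUS_PER_COMPUTER = 2

DESKTOP_GFLOPS = 200          # "almost 200GFLOPS"
DESKTOP_SHARE_PERCENT = 20    # desktop is "20% of the speed" of the old system
MODERN_SPEEDUP = 1_000_000    # "1 million times faster"

# CPUs implied by the rack layout alone.
cpu_count = RACKS * COMPUTERS_PER_RACK * CPUS_PER_COMPUTER

# Speed of the old 16-rack system implied by the desktop comparison, in GFLOPS.
old_system_gflops = DESKTOP_GFLOPS * 100 // DESKTOP_SHARE_PERCENT

# Speed of today's fastest machine implied by the million-fold claim, in FLOPS.
fastest_flops = old_system_gflops * 10**9 * MODERN_SPEEDUP

print("CPUs from rack layout:", cpu_count)
print("Old system, GFLOPS:", old_system_gflops)
print("Implied fastest machine, exaFLOPS:", fastest_flops / 10**18)
```

So the comment's numbers hang together: an old system of roughly one teraFLOPS, and a present-day leader of roughly one exaFLOPS.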
