Ref Review 2012/13: extrapolating Important decisions

By Walter Broeckx and Tony Attwood

(Note: the gathering of data and interpretation of it is by Walter.  The commentary on sampling techniques is by Tony)

This article is part of the series of the Referee Review 2013. You can find links to earlier articles at the bottom of this article.


On this site we have published all kinds of reports on the 2012-2013 season.

We have dealt with the different teams. We have looked more closely at the refs themselves, leading to the election of the best ref of the season according to the views of, and the numbers found by, our referee reviewers.

We then had another look at the bias from the refs for and against each individual team.

The next step was to look at the four most important decisions on the football field that could have the biggest impact on the final result of games.

We have examined the wrong decisions about second yellow cards, red cards, penalties and goals.

These are the “important decisions” as we call them: decisions that can change games. It is important to see how many such decisions teams suffered in the games we reviewed. This will shed some extra light on the famous saying that it all evens out at the end of the season.

For some it might, but shouldn’t it even out for all at the end of the season? Because if it doesn’t even out for even one team at the end… the playing field was not level. And it should be level for all!

So in an earlier article we took all the previous important decisions and put them in total tables. We no longer distinguished between the types of decision, because they are all among the most important decisions: decisions that can change games and results.

That final table could be somewhat comparable with the debatable decisions website that was active for a while. But our table should be called: The Wrong Decisions Table.

Now someone said that this didn’t tell the whole story and they are right in a way. As we couldn’t do all the games, some things might have looked different if we had.  Although it should be pointed out that no analysis ever covers every example.

Sampling is a fundamental approach to analysing all human behaviour – no one had to examine the entire white population of the United States in 2006 to establish that 22.3% of whites had blue eyes.  They did samples which took into account age, location, sex, nutrition etc and matched that to the national population.   Such surveys can be undertaken by interviewing maybe just 500 people.   To check that the result is correct a totally separate survey conducted by a totally different team using different samples could be undertaken.  If it comes up with the same answer there is a very strong chance the answer is right.
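To give a feel for why a few hundred people can be enough, here is a minimal sketch of the usual 95% margin-of-error formula for a proportion. It assumes a simple random sample, which is a simplification of the matched surveys described above. The function name is an assumption made for illustration, and the 22.3% and 500 figures are simply the ones quoted in the paragraph.

```python
import math

def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate margin of error for a sampled proportion.

    Assumes a simple random sample; z=1.96 corresponds to a 95% confidence level.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Figures quoted in the text: 22.3% found in a sample of 500 people
moe = margin_of_error(0.223, 500)
print(f"{moe:.3f}")  # about 0.036, i.e. roughly 3.6 percentage points either way
```

Under these assumptions the estimate from 500 people would be expected to land within a few percentage points of the true figure, without anyone having to examine the whole population.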

Likewise the medical authorities in England did not have to administer ibuprofen to every living person in England before deciding that products containing ibuprofen should be available to buy without prescription from a GP.   They did a wide range of animal studies, then tests on a small number of volunteers.   In short, even on vital matters relating to health, sampling is the process that is used.

Thus the notion that we should have examined every game in order to get any insight into referee decisions goes against all the normal procedures of estimating human behaviour.   It is suggested often by disbelievers on this site, but it has no place in sampling technique.  If we only looked at one match per referee the argument could be put that the sample was not big enough, but given the level of sampling that we have undertaken, this does not apply.

What we can do is take the trend from the matches we have done, multiply it by 38 games and then divide by the number of games we actually reviewed.
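As a rough sketch of that calculation, the scaling can be written as a small function. The function name and the example figures are hypothetical, chosen only to show the arithmetic; they are not taken from our review tables.

```python
def extrapolate(net_decisions, games_reviewed, season_games=38):
    """Scale a net count of wrong important decisions up to a full season.

    net_decisions: decisions in favour minus decisions against, over the reviewed games
    games_reviewed: how many of the team's games were actually reviewed
    """
    if games_reviewed <= 0:
        raise ValueError("games_reviewed must be positive")
    return net_decisions * season_games / games_reviewed

# Hypothetical team: a net of -6 decisions over 24 reviewed games
print(extrapolate(-6, 24))  # -9.5
```

Note that a team for which all 38 games were reviewed keeps exactly the number it already had, since multiplying and dividing by 38 cancel out.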

But for the benefit of readers who still feel that no sample is good enough unless it covers a huge number of examples, in the table I am about to show you I have highlighted in yellow the teams for which we reviewed fewer than 50% of their games. In any other survey 50% would be considered more than enough to get a fair resultant figure, but I don’t want to talk about those teams as we don’t want to be distracted by talk about sampling.  I put them in to be complete, and because most people will find that acceptable, but for those who worry, they are not considered in the overall total.

So remember before reading this that this is an extrapolation exercise and does not show the actual numbers from our reviews.  It is the table we get if we extrapolate the numbers we found in our reviews to 38 games for each team:

[Table image: extrapolated 2013]


We do see that Arsenal is still in top position when it comes to suffering from important decisions. That is the only number that has stayed unchanged from the previous articles as we did all Arsenal matches in our review last season.

What does change is that both clubs from Liverpool also seem to suffer a lot. Again one has to be careful about it but they surely weren’t on the good side of things.

Manchester City only had a small amount of wrong decisions going against them.

Chelsea on the other hand was on the positive side of things if we look at the numbers like we did in this article.

Tottenham is a team that could have enjoyed 9 important decisions if we look at the numbers over 38 matches.  And the same could be said of Sunderland, although if you are concerned about numbers you might like to note that we did 50% of their games.

If we had done all 38 of Manchester United’s matches we might have seen them being given 18 big and important decisions in their favour.

From this we can see that there is a difference of 38 important decisions between Arsenal and Manchester United. That amounts to one potentially defining moment in every game of the season separating those clubs: a difference of 38 important decisions over the 38 matches of a season. Now surely that would have a big impact on the points total of any team.


In this series

Wrong second yellow cards

Wrong red card decisions

Wrong PENALTY decisions, a closer look.

Wrong goal decisions

It doesn’t all even out in the end



7 Replies to “Ref Review 2012/13: extrapolating Important decisions”

  1. I’m glad you published this as I was a bit tardy in getting around to commenting on the previous article… that you needed to divide the score by the number of games reviewed to get a comparable score for each team.

    And you focused on exactly the key conclusion. On average MU got net one important (erroneous) decision in their favour as opposed to Arsenal every single game. Even if the Untold numbers are slightly biased in Arsenal’s favour it is still likely that MU in effect, last season, had a colossal head start.

    As MU’s positive bias seems – anecdotally – to have declined this season that comparison won’t be so bad – but Arsenal still seem (anecdotally) to be getting the thin end of things as much as ever.

    A real pity that Untold no longer have the resources to continue this effort. I really hope someone else can pick up the baton because the only way to prove bias, in the absence of a smoking gun, is by deep statistical analysis where bias can be proved beyond reasonable doubt.

    If the PGMOL’s stats are correct (90%+ decisions correct and – obviously – no bias…) then this kind of outcome is many standard deviations away from what should be expected. To all intents and purposes this is “impossible”. So something is wrong. Incontrovertibly!

    Unfortunately, a large proportion of the population, including many who are otherwise educated, have a very limited education in statistics and probability and just don’t understand this.

  2. Great work Tony & Walter. As a retired psychologist who worked continuously with sample populations, standard deviations, means, modes and statistical variations used to measure human behaviour and predict possible outcomes, I think it is important to remind our UA readers that statistics are predictive, NOT certainties. The very nature of statistical sampling across a very large or small population/sample (the EPL is a small population with 22 teams playing 38 league games = 176 referee performances) changes the accuracy and usability of the results. What I mean here is that sampling 17 referees’ performances over a 38 game season does not require anywhere near what Walter and Tony have done to produce fairly accurate and highly predictive results. Most psychological testing requires thousands of members in a sample or population, and the standard deviation and variance (how much one result varies from another) must still be taken into account when using the subsequent results.
    Therefore, reviewing every referee’s performance would NOT give us any more accurate a picture than doing what Walter & Tony have done. The principle here is that looking at 30-40-50-60% of the games will give more or less the same results.

  3. I guess that even if the sceptics are forced to concede the results you present for 2013 are valid they would then argue that it is only one year and thus are not of any value as the sample rate is only one. They would say yes OK but things would even out if you were to look at the last 10 years say.

  4. re: City game.

    Ignoring the bizarre and rare spectacle of the linesmen getting so many calls wrong in the same game against the same opponent (must be a coincidence!) we can state with confidence that the extra day’s rest gave Abu Dhabi a clear advantage.
    Two vs. three days’ rest is a crucial difference; this data has been available for a while. It is why the German FA are building their own complex in Brazil so their players don’t have to faff about travelling between different complexes in between games every three days. They want their players to rest and recover in those three days, as much as is possible:

    “research by the former Wales assistant manager Raymond Verheijen. Last year he analysed 27,000 matches – from seven top-flight European leagues, the Champions League and Europa League – and found teams playing after only two days’ recovery against teams who had enjoyed at least a three-day gap were 42% less likely to win. He has called for a three-day gap between matches mandatory in all fixture scheduling”
