REF REVIEW 2012 : Are we biased because Arsenal is included in the numbers?

—————————————————

This article is part of the series : REFEREE REVIEW 2012

—————————————————-

 

By Walter Broeckx

Every now and then we have received comments from people about our numbers. I am not referring to the purely abusive comments but to people who tried to make a point. One of those points has been made a few times, and I will try to write it down as it was put to us:

“As this is an Arsenal site I find it difficult to believe your numbers because the numbers include Arsenal games and therefore the bias from you referee reviewers is shown in the numbers”.

And this was said not only by supporters of other teams but also by Arsenal supporters. So I can say that there was some concern about our numbers, a concern I can understand completely. Of course I know that I and my fellow referee reviewers have reviewed each game with the will to be as impartial as can be. The fact that our reviews are open for anyone to see and consult adds to that, of course. Nobody wants to look like a fool, so we will not write down a bunch of nonsense, because at some point it would backfire on us.

Of course there can be disagreement about some of the decisions we made. I remember Arsenal fans not agreeing with us when we didn’t give a call to Arsenal because, with the rules in hand, we disagreed with the “general opinion”. And we have had supporters of other teams who disagreed with a decision here or there in a game in which their team was involved.

Of course part of this concern could have been lifted if referees who support other teams had joined us. Despite a few appeals, nobody really came forward and offered to help us.

So how can we show you that the Arsenal games were not decisive in this report? Well, this is what I will try to do in the rest of this article.

First of all, we have based our final report on 155 games. If we completely erase all the Arsenal games, we are left with 117 games. That means 117 games out of a total of 342 (380 minus the 38 Arsenal games), which is 34.21% of all the PL games. Not as much as the 40% we did in total, but still more than 1 game in 3.
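The coverage figures above can be re-derived in a few lines. This is only a sketch of the arithmetic; the variable names are illustrative, and it assumes, as the subtraction in the text implies, that all 38 Arsenal games were among the 155 reviewed.

```python
# Figures taken from the paragraph above.
TOTAL_PL_GAMES = 380   # games in a Premier League season
ARSENAL_GAMES = 38     # league games involving Arsenal
REVIEWED_GAMES = 155   # games in the final report

# Assumption: every Arsenal game was reviewed, so removing them
# removes 38 games from both the sample and the population.
reviewed_without_arsenal = REVIEWED_GAMES - ARSENAL_GAMES
pl_games_without_arsenal = TOTAL_PL_GAMES - ARSENAL_GAMES

coverage_all = 100 * REVIEWED_GAMES / TOTAL_PL_GAMES
coverage_without = 100 * reviewed_without_arsenal / pl_games_without_arsenal

print(f"Reviewed games without Arsenal: {reviewed_without_arsenal}")
print(f"Coverage with Arsenal games:    {coverage_all:.2f}%")
print(f"Coverage without Arsenal games: {coverage_without:.2f}%")
```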

We will now give you two tables: an un-weighted and a weighted table. In these tables we compare the % of correct decisions. The first number you will see is the one based on all the games, including the Arsenal games. The second number is the one based on all the games with the Arsenal games erased from the database. Next to that we have put the difference between both numbers and the number of games on which this is based. Better to see for yourself now, and we start with the un-weighted % of correct decisions.

% correct decisions    Included Arsenal    Not included Arsenal    Difference    Nr games
SWANSEA                          81.730                  81.930         0.200           6
ASTON VILLA                      79.110                  74.680        -4.430           4
WBA                              76.170                  74.440        -1.730           4
BLACKBURN                        76.080                  74.830        -1.250           8
BOLTON                           75.430                  77.130         1.700           8
WIGAN                            75.280                  76.890         1.610           6
MAN CITY                         74.750                  74.900         0.150          30
LIVERPOOL                        74.020                  75.320         1.300          18
CHELSEA                          73.320                  73.170        -0.150          30
TOTTENHAM                        72.960                  72.820        -0.140          19
MAN UNITED                       72.800                  72.580        -0.220          30
EVERTON                          72.460                  74.220         1.760          10
NEWCASTLE                        72.290                  72.410         0.120          14
NORWICH                          72.100                  72.470         0.370           9
SUNDERLAND                       70.930                  71.510         0.580          12
ARSENAL                          69.800
QPR                              68.550                  71.270         2.720           4
WOLVES                           68.240                  70.290         2.050           6
FULHAM                           65.450                  71.790         6.340           8
STOKE                            64.480                  66.300         1.820           6
Average difference on 19 teams                                          0.674

And if you now look at the teams you will see that in most cases there is a difference. Of course it is only normal that there is a difference if you remove part of the database. But 8 of the differences are less than 1%. Then we have another 7 teams with a difference between 1% and 2%. So in total we have 15 teams (out of 19, as Arsenal doesn’t count) with a difference of less than 2% in either direction. That is almost 79% of the teams falling into this category.

Four teams have a bigger difference. But as we have shown before, when a ref puts bias in a game, it is obvious that there can be big changes if we remove that game. The biggest swing is found for Fulham, but in one of those games we had ref Probert, who got the lowest score of any ref in the season if I remember correctly. So it is obvious that this has a great impact on Fulham’s numbers. For Aston Villa we see the reverse situation: in the Arsenal game the ref did a great job, and removing that game makes the number for Aston Villa go down.

But the most important line is the last line we put in bold and I will repeat it for you.

Average difference on 19 teams: 0.674%

Because with or without the Arsenal games there is only a difference of 0.674% in total on the final outcome of the % of correct decisions.

Yes, there is a difference, but it is nothing more than that. On average the difference is less than 1%.
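For readers who want to check the average themselves, here is a minimal sketch that recomputes it from the un-weighted table. The 0.674 figure is a simple (signed) average, so rises and falls partly cancel; the sketch also prints the mean absolute difference, which is an extra measure not used in the article. Team abbreviations are written out, and the variable names are illustrative.

```python
# (with Arsenal games, without Arsenal games) % of correct decisions,
# copied from the un-weighted table. Arsenal itself is left out.
UNWEIGHTED = {
    "SWANSEA": (81.730, 81.930),
    "ASTON VILLA": (79.110, 74.680),
    "WBA": (76.170, 74.440),
    "BLACKBURN": (76.080, 74.830),
    "BOLTON": (75.430, 77.130),
    "WIGAN": (75.280, 76.890),
    "MAN CITY": (74.750, 74.900),
    "LIVERPOOL": (74.020, 75.320),
    "CHELSEA": (73.320, 73.170),
    "TOTTENHAM": (72.960, 72.820),
    "MAN UNITED": (72.800, 72.580),
    "EVERTON": (72.460, 74.220),
    "NEWCASTLE": (72.290, 72.410),
    "NORWICH": (72.100, 72.470),
    "SUNDERLAND": (70.930, 71.510),
    "QPR": (68.550, 71.270),
    "WOLVES": (68.240, 70.290),
    "FULHAM": (65.450, 71.790),
    "STOKE": (64.480, 66.300),
}

# Difference = value without Arsenal games minus value with them.
differences = [excl - incl for incl, excl in UNWEIGHTED.values()]

# Signed mean: positive and negative differences offset each other.
signed_mean = sum(differences) / len(differences)

# Mean absolute difference: the typical size of a change, ignoring direction.
mean_absolute = sum(abs(d) for d in differences) / len(differences)

print(f"Teams compared:           {len(differences)}")
print(f"Signed mean difference:   {signed_mean:.3f}")
print(f"Mean absolute difference: {mean_absolute:.3f}")
```

On the table's figures the signed mean reproduces the 0.674 quoted above, while the mean absolute difference comes out at roughly 1.5 percentage points. Both are small next to the spread between teams, but they answer slightly different questions.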

But those who have been following us over ref and whistle in the past months or years know that where there are un-weighted numbers we also have weighted numbers. So here are those numbers too; the table is built up the same way as the previous one.

% correct decisions    Included Arsenal    Not included Arsenal    Difference    Nr games
SWANSEA                          81.630                  80.130        -1.500           6
ASTON VILLA                      79.810                  75.140        -4.670           4
WBA                              75.910                  73.730        -2.180           4
BOLTON                           75.180                  81.120         5.940           8
WIGAN                            75.100                  71.980        -3.120           6
BLACKBURN                        74.670                  73.210        -1.460           8
MAN CITY                         73.720                  74.210         0.490          30
MAN UNITED                       72.580                  70.420        -2.160          30
NEWCASTLE                        72.290                  72.860         0.570          14
CHELSEA                          71.700                  71.270        -0.430          30
TOTTENHAM                        71.510                  71.050        -0.460          19
LIVERPOOL                        71.500                  73.180         1.680          18
EVERTON                          71.350                  73.640         2.290          10
NORWICH                          71.250                  72.180         0.930           9
SUNDERLAND                       69.980                  70.540         0.560          12
ARSENAL                          68.640
WOLVES                           66.970                  69.250         2.280           6
QPR                              66.300                  72.730         6.430           4
FULHAM                           63.640                  68.520         4.880           8
STOKE                            61.660                  61.140        -0.520           6
Average difference on 19 teams                                          0.503

As usual, the fluctuations are bigger when we put weight on the decisions. This is of course because in the first table each decision only counts for a 0 or a 1, while in the weighted table it can count for 0, 1, 2 or 3. Hence the possibly bigger differences in some numbers.
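To make the difference between the two tables concrete, here is a small sketch with made-up decisions. It assumes that a weighted score is the sum of the weights of the correct decisions divided by the sum of all weights; the exact formula used for the reviews is not spelled out here, so treat this as an illustration only. The decision list and names are hypothetical.

```python
# Hypothetical decisions: (weight, was_the_call_correct).
# Weights 1 to 3 stand for minor to major decisions; this data is invented
# purely for illustration and does not come from any review.
decisions = [
    (1, True), (1, True), (1, True), (1, False),
    (2, True), (2, True), (2, False),
    (3, True), (3, False), (3, False),
]

# Un-weighted: every decision counts as 0 (wrong) or 1 (correct).
unweighted_pct = 100 * sum(1 for _, ok in decisions if ok) / len(decisions)

# Weighted (assumed formula): a correct decision earns its weight,
# a wrong one earns 0, and the total is divided by all available weight.
earned = sum(weight for weight, ok in decisions if ok)
available = sum(weight for weight, _ in decisions)
weighted_pct = 100 * earned / available

print(f"Un-weighted % correct: {unweighted_pct:.2f}")
print(f"Weighted % correct:    {weighted_pct:.2f}")
```

In this invented sample the wrong calls are concentrated in the heavier decisions, so the weighted percentage drops below the un-weighted one. A single badly refereed game with several big wrong calls moves the weighted number further for the same reason.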

We now have 7 teams with a difference of less than 1%, and another 3 teams with a difference between 1% and 2%. The explanation given after the un-weighted table applies here again. When a ref had a rubbish game, deleting that game has an even bigger effect on these numbers.

And yet again I want to draw your eyes to the most important number in this table:

Average difference on 19 teams: 0.503%


So even when we put weight in the decisions we only see a difference in total between the table that includes the Arsenal games and the one without the Arsenal games of just 0.5%.
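The counts quoted for the weighted table can be checked the same way. The sketch below groups the teams by the size of their swing and recomputes the average; the grouping thresholds follow the text, and the names are illustrative.

```python
# (with Arsenal games, without Arsenal games) weighted % of correct decisions,
# copied from the weighted table. Arsenal itself is left out.
WEIGHTED = {
    "SWANSEA": (81.63, 80.13),
    "ASTON VILLA": (79.81, 75.14),
    "WBA": (75.91, 73.73),
    "BOLTON": (75.18, 81.12),
    "WIGAN": (75.10, 71.98),
    "BLACKBURN": (74.67, 73.21),
    "MAN CITY": (73.72, 74.21),
    "MAN UNITED": (72.58, 70.42),
    "NEWCASTLE": (72.29, 72.86),
    "CHELSEA": (71.70, 71.27),
    "TOTTENHAM": (71.51, 71.05),
    "LIVERPOOL": (71.50, 73.18),
    "EVERTON": (71.35, 73.64),
    "NORWICH": (71.25, 72.18),
    "SUNDERLAND": (69.98, 70.54),
    "WOLVES": (66.97, 69.25),
    "QPR": (66.30, 72.73),
    "FULHAM": (63.64, 68.52),
    "STOKE": (61.66, 61.14),
}

# Swing per team = value without Arsenal games minus value with them.
swings = {team: excl - incl for team, (incl, excl) in WEIGHTED.items()}

# Group teams by the size of the swing, ignoring its direction.
below_1 = [t for t, d in swings.items() if abs(d) < 1]
between_1_and_2 = [t for t, d in swings.items() if 1 <= abs(d) < 2]
at_least_2 = [t for t, d in swings.items() if abs(d) >= 2]

signed_mean = sum(swings.values()) / len(swings)
largest = max(swings, key=lambda team: abs(swings[team]))

print(f"Below 1 point:       {len(below_1)} teams")
print(f"Between 1 and 2:     {len(between_1_and_2)} teams")
print(f"2 points or more:    {len(at_least_2)} teams")
print(f"Signed mean swing:   {signed_mean:.3f}")
print(f"Largest swing:       {largest} ({swings[largest]:+.2f})")
```

With the table's figures this gives 7, 3 and 9 teams in the three groups and a signed mean of about 0.503, matching the numbers in the text.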

So hiding behind the argument that this is an Arsenal site, and that the results we have found are down to our bias, looks a bit too easy.

And if you really want to see that number of not even 1% as proof of us having some kind of bias, be my guest. You can take each number from now on, look back at it, and add 1% to it. Or take 1% away from it. If you really want to do this, you can read the numbers that way. It will slightly change the final number. But it doesn’t change the final outcome of our reviews.

What was wrong, stays wrong. What was good, stays good.

And the one thing that comes out of this comparison is that if there was bias, it was such a small bias that we can be proud of the way our referee reviewers have done their job.

I even dare to say categorically that this slight difference means that even in the Arsenal games our referee reviewers have done all that is humanly possible to keep their bias away from the reviews.

And I just want to ask you to give this thought some consideration:  just imagine that our referee reviewers did their job in an almost non-human perfect way and that our numbers FOR ALL TEAMS were correct. And if you still are not convinced about the job our referee reviewers did I can only challenge you to come up with your own reviews. I can only tell you to challenge our reviews as they are out there in the open.

And if you don’t want to do this, or if you are not able to do this, then the only option you have is to accept that what we have done is something that nobody has done before. Like it or hate it. But that is a fact.

Next in our series we will give you all the different league tables with all the teams in them. In these articles you will be able to see in the blink of an eye who did well and who came off worse from the refs.

Editorial footnote: As this article goes live, we are working on a scheme that will bring in referees who support other clubs to join us, so that we can have more refs and review more games.  The discussions involve bringing in another organisation, and at this point we’ve no idea if we can make this work or how much it will cost us, but we are trying.  We’ll keep you informed.


37 Replies to “REF REVIEW 2012 : Are we biased because Arsenal is included in the numbers?”

  1. Well done.

    I’m an Arsenal supporter who did express concern that the fall-back position for critics would be ‘but you are Arsenal supporters’. This is a great response.

    The Wikipedia entry on PGMOL has stayed there (it was removed once as you are aware), might be worth adding to Mike Riley’s as well.

  2. This is excellent and shows that the overall data is very sound. I hope you put this together as a book to be published. It can be how the refs affected the 2011/2012 season! They are affecting outcomes by being poor, having favorites and teams they love to screw over.

  3. Strong response, thank you. However you will always face this issue and if you are serious about getting widespread readership and take up by the media then these ref reviews must be posted on an independent non Arsenal site (of course you can still link to them from here).

    A couple of other suggestions:
    What are other league tolerance levels for competence? If you have 70% (Belgium) then biased refs know they can work within this number. I couldn’t imagine that the German league for example would tolerate a 30% “mistakes” level.

    Similarly you should do a survey of the main leagues in Europe over number of refs and how many games they can cover for a team. It is amazing that Dean for example did 6 Arsenal games last season. This is a potential “easy win” for you if you can prove that the EPL is way out of line on this matter. You then have a chance to get pressure on PGMOL to change (start a media campaign, etc). Another good comparison would be over the same numbers of refs and game coverage for the EPL in the pre and post Riley periods.

  4. Walter,
    Please, clarify for us the average figures for PL (goals, offsides, cards and so on). Did you use Arsenal and Stoke numbers in calculating the averages?

  5. Gary,
    I’m not a statistician, but I’m concerned by your comment and its possible direction and its possible impact on people like me who don’t know and find statistics – in this culture – a bit intimidating. Now, as I understand it, there are different measures of average (mean, median and mode), so it’s fair to ask what average means. But I also understand that to represent reality as it more likely is, that there is no iron-clad rule that one must exclude so-called statistical outliers (if that’s where you’re going) to represent what is “real”. In other words, there are truths in the extremes (outliers) that may shed enormous light on the whole. Plus a strong case can readily be made, especially in the present context here, that to exclude Arsenal and Stoke (or whoever is at either end of the results) would actually undermine the study but, in doing so, misrepresent reality.

    If that’s your implication, what makes them outliers? What does an outlier do in your view in this study? Is it a purely formal definition? Does it insist that the middle is the truth and the extremes are not? Especially, when the power of this study itself – of a small group of teams as opposed to say, the London phone book – is that it demonstrates (1) that yes, there ARE extremes in their treatment by 17 referees; and (2) that there must be a reason – to be further investigated – for these extremes to exist.

    For anyone to knee-jerk exclude such extremes to uphold the formal validity of a Bell Curve (that is best suited for studies of very large samples) is to enshrine the method (or some idealized model of “proper inquiry”) and lose (or exclude) the very reality that it purports to shed light on. As I say, I’m no statistician, and I don’t know if I’m right, but this is what I fear and what I think. Perhaps we’re on the same page altogether. So Please clarify/correct me/this if you feel otherwise, and say why. In sum, where are you going with your question? and please help educate me/other similarly challenged readers with your further thoughts.

  6. p.s. or perhaps this study is subject of your term paper to be and you need to clarify the author’s approach? 🙂

  7. I think that the criticisms of your figures are due to the lack of a control group alongside not accepting that there is always phenomenological bias in life.
    Subjectivity occurs as soon as our senses become involved and the only time reality is experienced is when we’re dead. We view the world as we are not as it is and it is so difficult to ‘bracket’ aside our bias in order to see a level of objectivity so all we can do is accept that it exists in order to lessen its impact. To deny bias existing in your world means that there will be an increase in the effect of the implicit bias into the explicit results.

  8. Guys, Why do we keep asking Walter and his small team of extremely good reviewers to forever do more without offering to help them in any way? There is only so much they can do without help. They are not getting paid for this you know. To give up your time and effort gratis for the benefit of people who largely remain unappreciative is a huge deal.

    Walter & Co, I thank you for all the work you have done and all the hours you have put in. My hope is that all these anomalies that are being swept under the carpet blow up in a nasty way in the faces of the FA, PGMOL, the Media and most importantly Sky who seem to help encourage it either directly or indirectly; same as it blew up in Italy.

  9. I am surprised by this result. I had assumed that the bias against The Arsenal was so great that with those matches excluded the bias figures would show a large improvement.

    Despite my (obviously) wrong assumption, however, this is a great answer to the pro-AFC bias question raised.

    Great job with this and (again) great job with the reviews.

  10. @Bob’s reply to Gary:
    Nice reasoning Bob.

    Either the Stoke and Arsenal data are NOT statistical outliers: in which case the assumption drawn from their inclusions can be considered likely to be valid;

    Or they ARE outliers. Their exclusion from the general population would, in this case, remove any skew that they cause in the assessment of those data…BUT…that still leaves them as outliers.
    This would beg the question of why are they outliers.

    Either way it is still a-can-of-worms-time.

  11. Matt, Bob

    This is a point I mentioned previously and Walter confirmed the data back then did include Arsenal so in effect when comparing Arsenal vs rest of the league, that rest of the league included ourselves. I think the data was regarding home or away bias and included Arsenal as the home or away team when comparing the rest to Arsenal.

    Walter,
    With regard to next season and possible funding, will you be accepting donations from avid readers? I’m sure many of your readership would be more than happy to make a small donation to cover basic expenses and even offer the refs a small incentive to take part which would hopefully attract more refs.

    I wonder how many unique visitors there are and how many of them would want to donate. I would.

  12. Walter/Dogface, All:
    So far, has anyone come across even one mention of this ongoing study – now several weeks into its roll out – in any blog or mainstream media? Even one? Hmmmm…..? Anyone in the media in favor of the free marketplace of ideas? Where are the so-called and alleged (by some readers hereabouts) fair brokers at the Manchester Guardian football department? Then again, IF the media counter-attack comes, it will signify that this project has made serious headway (to say the obvious).

  13. If I have 20 random numbers between 80 and 65 and take out one number the effect on the average will always be very small. There will never be a great deviation so any seemingly small deviation can equate to a large difference.
    If you have 19 scores of 80 and one of 70 the average is 79.5. Take away the anomaly and you get a average of 80 or a 0.6% deviation. 0.6 seems tiny yet the difference is 10 points.
    Or am I reading the figures wrong?

    PS I do believe there is bias

  14. while i agree with you fred and that is a valid question, i have to say that you could use that argument to dispute a lot of studies. maybe if it was possible to translate the % into scores, there could be a more clear idea of the difference but of course, human error will be always present no matter what is done and i do not wish to ask more (for the moment) to the team that did this voluntarily taking from their free time.
    this study is a start and maybe its not completely accurate, nevertheless it gives one a good idea of how things are if anything else

  15. Excellent response Walter and here are a few suggestions based on statistical theory that might augment your research:

    1) Can you have someone help you define the margin of error on the scores? The higher the margin of error the less reliable the score is.

    2) Can you find a statistician to help you identify the reliability and validity of your reports? This is far more complicated but can be done. It would show that not only do your results remain within the margin of error but that they can be repeated and still come up with the same results.

    The idea of finding a control group would also be very useful. It could be the Championship or maybe another professional League in Europe. Saying this I realize that your resources and time are very limited already but I am putting this out to all UA readers and bloggers. Maybe a few will step up and help out. My skills are not up to it nor do I have the possibility of watching matches regularly but I would be willing to do what I could to help out as well.

  16. Guys, big fan of your reviews. However, if you want to see whether your reviews are unbiased, why not get an independent referee to review a random sample of your decisions, say 25, and see whether they agree with you. Might be best to focus on only Arsenal games as these would be likely the ones most open to bias (although you could argue that Tottenham games carry a similar level of partiality).

    Also I noted that the BBC ran an article on your work a while back, any plans to try and get a summarised version of your findings put into an article?

  17. Also, Walter, is it possible to work out the bias number that proves to within 95% certainty that there is bias in the results? (assuming all the reviews are correct).

    i.e. if you flip a coin 1000 times, you don’t expect to get 500 heads and 500 tails, but if you get 800 heads, it’s highly probable that the coin isn’t evenly weighted.

    So over 38 games, you wouldn’t expect a bias number of zero, but what is the level where there is less than a 5% chance of delivering the result with unbiased refs.

  18. Chris,
    the person who wrote the article based on Untold referee previews and reviews doesn’t work for the BBC any more but he now works for CNN

  19. Of course getting us mentioned on CNN… mmmm would that be bigger than the BBC ? 😉

  20. Chris, Walter,
    Having an independent referee do an analysis of a random sample means that the sample should be random. If someone going in knows that all of the sample are Arsenal matches, then it would contribute something of bias either way (or so it could be argued). Perhaps the members of the sample could be chosen by a draw from numbered papers, each number linked to a particular match. Those 25 or whatever would be the random sample and the referee (someone truly above and beyond reproach from perhaps not the EPL) would set to it.

  21. Dom, Dogface,
    As I don’t know, but honestly ask: What would a control group actually mean for this type of study? What would say doing a season of La Liga accomplish? I don’t see the comparability. We’re not measuring particle behavior to achieve an outcome that is purportedly universal. What then if another league proves bent or not bent? What’s the relevance to anything outside that league at the time of the study? where’s the comparability?

  22. Walter, Dog,
    Again, let’s brazenly include Arsenal in this study:
    What is the statistical likelihood that AFC would not have a single penalty shot at home in the course of this past or any other EPL season? How great an anomaly, statistically speaking of course, would that be? Did any other team experience this? this season? last season? any season? Youth wants to know, so please heed this request or say it can’t be measured. Hmmm?

  23. p.s. Walter, Dogface,
    I think that if number of penalty shots, at home, league-wide/per team, in any given season can be counted (or whatever number can be culled from the current database), then how far from the average would Arsenal’s zero penalty shots be? And who were the attending referees when there were non-calls that should have been penalty shots according to your ref reviews? This could say a lot, methinks, about the state of Denmark, and to what heights (or depths) the odor has reached, statistically speaking. 🙂

  24. I can’t answer these calls for more research, control group, non-Arsenal supporting refs etc, but I can add a little more.

    First, to answer one question, no I don’t know anyone who is quoting us. If anyone cares to write into newspaper blogs, and indeed other blogs quoting our figures and a link over to us, that will help.

    There are also many articles that are relevant on wikipedia, and careful cross referencing within the rules of wikipedia would again help others find us. If you are a believer in what we are trying to do, and you would like to help, please do give us publicity in both those areas.

    Second, I am exploring a way to recruit non-AFC supporting refs, and of expanding the number of games we do. It is complex and I need to avoid costs (in reality our only sources of income are clicks on the adverts, and purchases of the Woolwich Arsenal book). But it might be possible.

    But please do remember, the level of work Walter and co are undertaking is enormous. No one is saying we have proven anything 100%, but there is more than enough here to make the unbiased person question what is going on, and to make PGMOL start taking notice.

    The silence of PGMOL is probably the most interesting thing of all.

  25. pps. Walter, Dogface,
    An addition to my 4:19 would be not only non-calls that should have led to penalty shots, but yellow cards that should have been red cards and thereby have led to penalty shots. Any slice of this, imo, would help establish an anti-Arsenal bias in statistical terms. This anomaly (outrage) has already been noted on UA, more than once as you and others have. In the current context, I think it would go farther in synergy with the rest.

  26. If I remember right, didn’t one of the bigger media outlets (was it the BBC?) pick up on Untold’s work and publish an article about it. Considering there’ve been some great findings in past weeks I’m surprised they haven’t taken notice again. Maybe an anonymous drop is in order in the hopes that it’ll interest someone?

    Maybe some kind of online marketing will help as well. We could visit other popular blogs, both Arsenal and non-Arsenal, and place a call to arms of sorts to see if anyone would be interested to help from a referee perspective?

  27. iniez,
    Walter noted that the BBC writer is now with CNN.
    As for publicity, anyone could post comments at other blogs with a link to the relevant article. If a fraction of the readers each did that once, there’d be publicity.

  28. bob,
    Maybe something like a pre-written message will make it more organized. Also I wouldn’t want to do anything that Tony and co wouldn’t approve of, this is their blog after all. But this could definitely help spread the word. We don’t want to end up spamming other blogs but getting the message out will surely get someone’s attention. Maybe even get more traffic through the site.

    Tony, Walter, Dogface
    What do you think? Those of us that can’t help with ref reviews can maybe act as your messengers instead?

  29. Bob & all,

    My suggestion on an earlier article was to write a brief overview and highlight each team’s bias figures or some eye opening data (except Arsenal maybe) whilst appealing for more refs and ask other blogs if we can have a guest post.

  30. Stuart, iniez,
    Personally, I don’t think bias figures out of the blue will be persuasive. What will they mean without the articles? Also, I think that a standard boilerplate message will definitely feel and read like spam. Imo, people here know their hearts and have good minds and will know what to write, and provide what info speaks to them and choose what they want to advance (depending on the context each time). It really doesn’t need to be (and, imo, should not be) so top-down organized and definitely not standardized. In any case, Tony (4:50) did call for people to provide links anywhere suitable: “If anyone cares to write into newspaper blogs, and indeed other blogs quoting our figures and a link over to us, that will help.” So, there you go.

    It’s all about publicity and the aim is to get people to read this stuff hereabouts, and then to make up their own minds. There’s data now in hand, a great case has been made (with more to come), so do feel empowered if you’re so inclined.

    As a coda, I hope that Walter/Dogface will be able to provide people with a powerhouse statistic on the likelihood of any club not having had a single penalty kick in 19 home games (that would be 1710-1786 minutes without!). It would be nice to toss a stat like that around, here and there, just as a wake-up call. If it can happen to us, it can happen to others as well.

  31. English refs used to have the highest worldwide reputation- not only for competence but for integrity. My europe-wide experiences now tell me this is no longer the case and PGMOL must take some, if not much, of the blame for this. If our PL refs were as unbiased as Untold reviewers- we would not be having this debate. Well done team for proving your professionalism. Your findings are ‘right’ because they fit in with reality. The bias is regular and persistent, involves more than Arsenal and has been going on for 3 seasons or more.
    To all doubters I say this. If you cannot see the obvious, get a paper and pencil and start doing some matches yourself. What is obvious to most people will soon be made clear, even to you!

  32. Great stuff Walter and co.! I have a concern with your avg. % difference though. If you did a simple average, then you added positive and negative numbers, which usually ends up near zero. It is often more insightful to use absolute value or squares and square roots (before summing and dividing your data) to see the real magnitude of change.

  33. On the basis of “Ask him, he can only say no” why not try to get Jeff Winter involved? http://www.jeffwinterentertainmentandmedia.co.uk/

    And whilst trying, David Elleray also. Neither particularly liked Arsenal, (in fact Elleray mentioned that Nige Winterburn was his least favorite player) though both to my mind were great refs, so they’d be admirable judges.
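Two of the numerical points raised in the comments can be checked directly: the example in comment 13 (19 scores of 80 and one of 70) and the coin-flip question in comment 17. The sketch below uses only the Python standard library. The function names are illustrative, and the coin model is a deliberate simplification: it treats every flip as independent and fair, which a referee decision is not, so it shows the method rather than any result about referees.

```python
from math import comb

# Comment 13: 19 scores of 80 and one score of 70.
scores = [80] * 19 + [70]
mean_all = sum(scores) / len(scores)
mean_without_low = sum(scores[:-1]) / len(scores[:-1])
shift = mean_without_low - mean_all        # change in the average
relative_shift = 100 * shift / mean_all    # the same change, as a percentage


# Comment 17: how many heads make a fair coin look unfair?
def fair_coin_upper_tail(flips, heads):
    """Exact probability of at least `heads` heads in `flips` fair flips."""
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favourable / 2 ** flips


def smallest_significant_heads(flips, alpha=0.05):
    """Smallest k with P(at least k heads) < alpha for a fair coin.

    Returns flips + 1 when no head count is that unlikely.
    """
    tail = 0  # number of outcomes with at least k heads
    total = 2 ** flips
    for k in range(flips, -1, -1):
        tail += comb(flips, k)
        if tail / total >= alpha:
            return k + 1
    return 0  # only reached if alpha is greater than 1


print(f"Mean of all 20 scores:    {mean_all}")
print(f"Mean without the 70:      {mean_without_low}")
print(f"Shift in the mean:        {shift} ({relative_shift:.2f}%)")
print(f"P(>= 800 heads in 1000):  {fair_coin_upper_tail(1000, 800):.3e}")
print(f"Heads needed in 1000:     {smallest_significant_heads(1000)}")
print(f"Heads needed in 38:       {smallest_significant_heads(38)}")
```

The first part shows why a small change in an average can hide one very different value: a single score 10 points below the rest moves the mean by only half a point. The second part gives a one-sided 5% threshold for a fair coin, which is the kind of cut-off comment 17 asks about; applying it to review data would need a proper model of what an unbiased referee looks like.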
