Phantom matches and why Arsenal have their own data company

By Tony Attwood

Back in 2007 when Untold started I was exploring all sorts of avenues of interest that I thought this site could follow.   One involved statistics.  I’d watched a match at the Emirates and looked that evening at the stats that appeared on some web sites.  Although I’d not made any notes at all during the game I just felt that the stats about corners, shots on target etc were not at all right.

The following match I took a notebook and did my own checking.  Sure enough I got a different number from that reported.  All the papers that covered the matter had one figure.  I had a different one.

So who would you believe?  Obviously the unbiased independent stats company.  The one company that gave statistics to all the clubs.  An Arsenal fan recording the statistics at an Arsenal match – bound to be biased.

But even though I tended to disbelieve my own figures I began to do a little sleuthing of my own, finding out what the companies who gathered statistics did and how they gathered all the data.

They were of course varied, and they did all sorts of things as well as telling us about the possession levels, number of shots on target and the like.  They also reported on players and matches in all sorts of places I’d never heard of.

Then in June 2008 the Daily Mirror reported that Peter Crouch was signing for Arsenal, and I began to think – if I know that this story is completely made up, what about all these stories that I can’t check so readily?

Time passed.

In January 2009 The Times published a list of up and coming giants of the game and included at number 30 (and I quote exactly)…

“Masal Bugduv (Olimpia Balti)

“Moldova’s finest, the 16-year-old attacker has been strongly linked with a move to Arsenal, work permit permitting. And he’s been linked with plenty of other top clubs as well.”

After that more stories emerged.  His transfer value went up.   He was going to go to Zenit St Petersburg because Arsenal were being too slow and didn’t want to pay the full amount.  He was already playing for his country.

But in the end it turned out that the player didn’t exist.

And then something else happened.  The Times just removed all reference to the story from their web site.  No apology, no update.  Total wipe clean.

So I began to wonder further about what was true.  And I began to realise the problem.

A lot of the “research” and “analysis” companies were, I realised, operating in a half-baked way and without any security around the figures they produced.  They used very low paid people to watch matches on TV and record the stats.  Stories abounded about people nipping out for a drink part way through and then averaging up their statistics from the bit of the match they had seen.  Some didn’t even make it to the match.

After that we had the tales of matches that were analysed (earning the employee of the statistical company a few pounds) and had bets placed on them (earning the match fixers some money in bets placed).  But the games were not even played.

Since then phantom matches have spread – but the media has remained in sturdy denial, simply because they too use the services of the statistics companies that pay peanuts and get back make-believe.

But now the problem is getting so out of control that even the sturdily resistant British press is acknowledging that not all football matches happen.  They are not admitting that their analyses might be wrong, but they are starting to tell us that the system is not quite as sturdy as it might seem.

For in the Belarusian Premier League a friendly between Slutsk and Shakhter Soligorsk was reported.   To everyone’s surprise Slutsk won 2-1.  To not very many people’s surprise the game didn’t actually happen, but was a phantom set up for the purposes of getting one over on the increasingly dopey gambling companies whose rule of thumb is “if it moves, accept the bet”.

It is good that some sections of the UK press have picked up on the story, and even better that some of the articles contain a reference to the fact that football gambling is a form of money laundering.

If you don’t live in England you might not know that in many towns our once thriving High Streets are now ghostlands consisting of fast food outlets, charity shops and gambling dens.  The money launderers walk into a gambling shop and play the new high stakes roulette games, betting on black and red.  They put in dirty money and at the end cash out clean money.  I’ve even heard of them working in gangs of three, with one betting on the home team to win, one on the away team to win, and the third on a draw.  The purpose of course is not to make a profit but to walk out with a modest loss and clean money.  The estimate is that getting on for £100 billion is laundered each year.

But the really big recognition is that at least one newspaper has now said the phantom games are not just created by gamblers but also by those supplying information to football data companies.  Their aim is neither to help the gamblers nor to mislead the clients of a particular firm into chasing second rate or non-existent players, but simply to supplement a very low income.

For it is not only the results of these games that appear but also the reports of the matches, the statistics and the performances of players.

The problem that everyone has here is exactly the same issue as with referees.  There is a huge disparity between what the people involved in the game are paid, and the amount of money churning through the football betting agencies.  If a data agent is paid say €25 a session and someone says, “tell your bosses that you went to this game and hand over this report – and here is €2000” the temptation is enormous.  The cost to the fixer is nothing compared to the money circulating.  If the data agent runs away with the money, the loss is nothing.   Quite often the fixing agent might pay a dozen data agents in the hope that a couple will come up with the report.

I still don’t know if my analyses of matches I tried to record in my notebook in 2008 were right and the data agency’s reports were wrong, or not.  I’m not going to do the exercise again because I have other things to do.

But I do think we have here another very good reason for Arsenal buying StatDNA.  The club has total control over the whole process now, thus reducing the chance of any interference from without.  I can only hope they are paying their data agents a decent salary.

Of course I don’t mean that Arsenal would be as dumb as the Times in believing in a non-existent player, but they can at least avoid wasting a lot of time, in that regard.  And perhaps more importantly, when they come to play overseas, they will know that the information they have on the home team is valid and that the home team is not reading Arsenal’s analyses.

18 Replies to “Phantom matches and why Arsenal have their own data company”

  1. Another interesting read Tony.
    I always wonder how correct some of the data I see after some matches are especially if it differs from my perception of the match. Now it makes more sense to me.

  2. The stat that I have often found most intriguing is the ball possession data. I believe that most of it is made up, borne out of perception of the observer on which team is pressing which. I humbly request enlightenment on this issue from anyone who knows.

  3. Do the PGMO produce their own stats or do they use one of the stat companies to do it for them? One wonders who would do the worst job.

  4. @ Untold Arsenal

    I am working on FA Cup and Champions League referee analysis (along with video of each wrong decision for Arsenal and against Arsenal).

    For the Premier League I will follow the referee analysis conducted by Untold Arsenal and make a video of each wrong decision for Arsenal and against Arsenal.

    For example, as there was no referee analysis for the Champions League games I made a little one just to fill in some gaps 🙂

    https://docs.google.com/document/d/1NceKz-6t37tEE0y31dzjaLU4aigV9GyUnTIb4ylNKVE/edit?usp=sharing

  5. Fascinating, Tony. Your description of the average British high street is all too accurate. Your description of money laundering on the high street is an eye opener to me at least. So that’s why we have so many gambling outfits.

  6. Boo – I believe, although could be wrong, that the possession stats include the time the ball is out of play. I think, typically, the ball is in play around 2/3rds of the time. So if the opposition keeper is taking an age to take a goal kick then all that time counts for his team’s possession number.

  7. The media stats are a complete waste of time and bear little relationship to what has happened in the game.

    The only true stat is the final score.

    I am sure that a numerical breakdown can and is useful, but only if it is specific to the club concerned.

  8. One of the approaches to automated help in determining if a goal had been scored was to put a chip inside the ball, nominally at its center (however that is defined). A person could extend this for determining ball over the end lines, ball over the side lines and ball over the midline.

    But, if there was a chip in the center of the ball, a chip in the center of the shoe under the base of the toes, and a chip mid sternum (chest) on a player, it would be possible to determine which player the ball was closest to. A chip at the base of the toes of each assistant referee on the sidelines might be useful in offside (or referee assessment).

    And other ideas as well.

  9. @ Gord

    Too many chips one would think. Some of them, like a chip on the head and chest, are not feasible.

  10. I was wondering how the heat maps are done for each player. I mean, how does the player get identified? Is there a chip on the player somewhere?

    And then stats for each player passes and movement. How is it all done?

  11. Players identified by video I would expect. They possibly do this once per minute, or they may do it more than that, and then plot average positions per minute.

    I would at a minimum, identify each player’s position at 30 second intervals. For each triplet of points (X-30 seconds, X minutes, X+30 seconds), I would find the radius of the circle that best describes those 3 points. At the center of that circle, I would print a colour gradient circle, where the characteristic distance over which the colour changes (Gaussian gradient) is 1/3 the radius. Repeat for all the data. A Gaussian is that “bell” curve that a lot of people call “normal”. Exp( -1 * X^2 / sigma^2 ) / normalization.
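
The heat map recipe in the comment above can be sketched in a few lines of Python. This is only an illustration of that commenter's idea, not how any data company is known to do it. The function names, the 105 x 68 metre pitch, the one metre grid cells, the use of overlapping (sliding) triplets and the choice of pi * sigma^2 as the normalisation are all assumptions made for the example.

```python
import math

def circumcircle(p1, p2, p3):
    """Return (centre_x, centre_y, radius) of the circle through three points.

    Returns None when the points are collinear, since no finite circle exists.
    """
    (ax, ay), (bx, by), (cx, cy) = p1, p2, p3
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-9:
        return None
    a2 = ax * ax + ay * ay
    b2 = bx * bx + by * by
    c2 = cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)

def heat_map(positions, width=105.0, height=68.0, cell=1.0):
    """Accumulate one Gaussian blob per triplet of consecutive positions.

    positions: (x, y) samples in metres, assumed to be taken every 30 seconds.
    Pitch size and cell size are assumed values, not taken from the comment.
    """
    cols, rows = int(width / cell), int(height / cell)
    grid = [[0.0] * cols for _ in range(rows)]
    # Sliding window: samples at X-30s, X and X+30s form one triplet.
    for p1, p2, p3 in zip(positions, positions[1:], positions[2:]):
        circle = circumcircle(p1, p2, p3)
        if circle is None:
            continue  # player moved in a straight line or stood still
        ux, uy, radius = circle
        sigma = radius / 3.0  # colour changes over 1/3 of the radius
        norm = math.pi * sigma * sigma  # assumed normalisation
        for row in range(rows):
            y = (row + 0.5) * cell
            for col in range(cols):
                x = (col + 0.5) * cell
                d2 = (x - ux) ** 2 + (y - uy) ** 2
                grid[row][col] += math.exp(-d2 / (sigma * sigma)) / norm
    return grid

# Three samples lying on a circle of radius 1 centred at (50.5, 30.5).
samples = [(49.5, 30.5), (51.5, 30.5), (50.5, 31.5)]
grid = heat_map(samples)
```

A plotting library could then colour each cell of `grid` by its value. Note that a player running in a straight line produces no blob at all under this recipe, which is one reason to treat it as a thought experiment rather than a finished method.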

  12. too many chips GORD, wonder if someone gets hungry mid game.

    PJs aside, Cricket has LED lights inside the bails. And the stumps. When the bails are dislodged the LED lights up.

    Also, we already have these warning systems in traffic signals, to warn cars that have crossed the white line. It's very sensitive. And of all the places, Bangalore has implemented it. And people who’ve been to Bangalore will know how pathetic the traffic situation there is.

    The simple thing to implement first is the pitch boundary sensors. The ball crossing the line can be used to call the goals too. You don't need a separate goal line technology for that. But the simplest thing that they can do is make the referees accountable. How much does it take to do that?

  13. In answer to Bootoomee I have read but cannot verify from my own knowledge that possession is determined by the number of passes each team makes. So add the passes of both sides together and the percentage of each will be the same as possession. I have found this an exact correlation on the couple of times I had the opportunity to check, but in truth that put me off possession stats themselves. This method seems to me absurd but it has the benefit of being simple and based on something already being measured. It is of course wrong because neither the mazy dribble nor the goalkeeper’s time wasting would be included whilst both are in reality forms of possession.

  14. Vintage Gooner, it comes about from how possession is defined. If a person from team A kicks the ball, the ball is still deemed to be in possession of team A until team B touches the ball (at a minimum). If players from both teams touch (and fight over) possession of the ball for some length of time, possession still resides with team A until team B establishes control over the ball.

    Incidents where players from each team battle to get control of the ball typically don’t take very long. Adding them all together gives a time much less than 90 minutes.

    If a person from team A kicks the ball towards the corner of the pitch, and it stays in bounds, and no player goes to get the ball; is either team in possession of the ball? According to the above definition, team A is still in control, so they still have possession.
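
The two definitions of possession discussed in the comments above (share of passes, and time in control until the other team takes over) can give quite different numbers for the same match. The Python sketch below is only an illustration under stated assumptions: the function names and the event format, a time-sorted list of (seconds, team) pairs marking the moment a team establishes control, are made up for the example, and nothing here is confirmed to be what any statistics company actually does.

```python
def possession_by_passes(passes_a, passes_b):
    """Possession as each team's share of all passes, in percent."""
    total = passes_a + passes_b
    if total == 0:
        return 50.0, 50.0  # assumed convention when no passes were made
    share_a = 100.0 * passes_a / total
    return share_a, 100.0 - share_a

def possession_by_time(events, match_end):
    """Possession as share of time in control, in percent.

    events: time-sorted list of (seconds, team) pairs; each pair marks the
    moment that team establishes control (hypothetical input format).
    A team keeps possession until the other team establishes control.
    """
    totals = {}
    closing = events[1:] + [(match_end, None)]
    for (start, team), (end, _) in zip(events, closing):
        totals[team] = totals.get(team, 0.0) + (end - start)
    elapsed = sum(totals.values())
    if elapsed == 0:
        return {team: 0.0 for team in totals}
    return {team: 100.0 * secs / elapsed for team, secs in totals.items()}

# Two minutes of play: A controls 0-60s, B 60-90s, A again 90-120s.
events = [(0, "A"), (60, "B"), (90, "A")]
by_time = possession_by_time(events, match_end=120)   # A 75%, B 25%
by_passes = possession_by_passes(20, 20)               # 50% each
```

In this made-up example both teams complete 20 passes, so the pass-count method reports an even split, while the time-in-control method gives team A three quarters of the ball. A long dribble or a slow goal kick moves the second figure but not the first.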
