Statistical analyses in football: an interview with the Professor



By Phil Gregory

I was recently fortunate enough to be able to do a short interview with Bill Gerrard, Professor of Sports Management and Finance at Leeds University. Professor Gerrard is heavily involved in the game, having worked with top sides in fields as diverse as squad valuations and performance analysis, while advising the board of an MLS franchise too.

I’ve always been interested in the numbers behind the game, and they consistently surprise and impress me. Whether it’s the 100% pass completion of one Denilson Pereira Neves playing in a side with ten men or the ProZone numbers I saw that showed “lazy” Dimitar Berbatov covered more ground than Carlos Tevez, the numbers allow us to see what we cannot see through TV. They also allow us to look past our preconceived notions of what is happening (simple prejudices about particular players, with which every football fan will be all too familiar).

Statistical performance analysis is a relatively new concept in football, a sport that has traditionally been fairly slow to take up new ideas. Football is also a fairly tricky sport for performance analysts to get stuck into: typically 12 hours are needed per 90 minutes of footage to dissect and record every action a player makes. With 380 games taking place in the Premier League alone this season, the difficulties are obvious, but even so the availability of the data is far greater than the ability of teams to analyse it. While it’s not a holy grail, and won’t pick the “best” team for the manager, it certainly offers insights that will allow them a competitive advantage over closed-minded competitors.

With the England squad only recently announced at the time of the interview, we first discussed the team itself and the difficulties a team game offers someone who sought to rate players by stats.

A sport like baseball is much easier to analyse: a hitter’s performance does not depend on the performance of his team-mates; it’s just him versus the pitcher and fielders. In football there is significant inter-dependence between players, meaning that raw data often needs to be considered subjectively, in the context in which the actions took place.

Darren Bent hit 24 league goals compared to Defoe’s 18, but how do we account for the fact that Bent was playing for a weaker side? How do we know who had the better service? Was either man the sole target through which all attacks went? It is these questions that limit the role that data has to play in team sports such as football.

Professor Gerrard also made the point that comparing who is the better of two players is often of limited relevance when considering the team game of football. I personally would argue that Bent is a better individual player than Heskey, but the latter is in the World Cup squad for what he offers the team (Wayne Rooney’s goal record with and without Heskey pretty much guaranteed the Villa man’s selection).

So how do performance analysts go about rating a player or team? Professor Gerrard stated that this depends wholly on the structure of the team in question and how they play the game.

Completed passes, for instance, would usually be a good indicator: if a team is making and completing more passes, they are using the ball well, which is bound to correlate with victory to some degree. However, a side such as Bolton under Allardyce would rate a pass as successful if it was (for example) a defender’s long forward pass that was then headed out for a throw-in, whereas the statistics would mark it down as an incomplete pass. Allardyce would disagree and argue that the pass was successful, as his side has possession of the second ball as well as significant territorial gain.

If it were Arsenal that were being rated (and indeed, this is the case for most of the rest of the league) pass completion percentages and the number of completed passes would be a fairly good guide for team performance. Arsène Wenger places significant emphasis on the tempo of the game (the time spent on the ball before it is passed on); many other teams don’t (“we lacked sharpness in the final third”, that ever-familiar line!). Different statistics offer an insight into different teams’ performance.

I then asked Professor Gerrard whether everything in the game is really measurable. I’m sure we’ve all spoken to a Liverpool fan who has claimed that other players give an extra 10% when they’re playing next to Steven Gerrard (perhaps not quite so recently…) but how would a performance analyst measure this? Is it just a myth or is there tangible evidence in the numbers?

While we didn’t have data to hand to confirm or deny this particular theory, Professor Gerrard pointed out that such things are ultimately measurable, as they are likely to affect measurable statistics. Do the players cover more ground?  Do they tackle more? Do they spend less time on the ground “injured”?

He did offer a word of warning, however. Such effects are not always positive, and may indeed be counterproductive even when our subjective viewing of the game makes us believe otherwise. The player in question may dominate and intimidate the side (think Henry in his later years with us).

Do players pass the ball to that specific player more than to an alternative forward, even when it is not the best option (as Anelka once famously alleged)? Do players spend less time on the ball and complete a lower percentage of their passes, i.e. are they rushing? Or do they cover less ground and tackle less, knowing a star player will likely win them the match? Hard data can allow a manager to have complete confidence that a player is under-performing and allow him to challenge the player on what may even be a subconscious “he’s here, so I don’t need to try so hard” approach. The simplest example, easily measurable by the fan at home, is win percentages: are they higher or lower when the player in question is fit and playing?
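The win-percentage check mentioned above is simple enough to sketch in a few lines of Python. This is only an illustration: the function names and the match results are invented for the example, not taken from Professor Gerrard or from any real season, and a split like this says nothing on its own about cause and effect.

```python
from typing import Iterable, List, Tuple


def win_percentage(results: Iterable[str]) -> float:
    """Share of matches won, as a percentage. Each result is 'W', 'D' or 'L'."""
    results = list(results)
    if not results:
        raise ValueError("no matches to evaluate")
    return 100.0 * sum(r == "W" for r in results) / len(results)


def split_by_player(matches: List[Tuple[str, bool]]) -> Tuple[float, float]:
    """Return (win % when the player played, win % when he did not)."""
    with_player = [result for result, played in matches if played]
    without_player = [result for result, played in matches if not played]
    return win_percentage(with_player), win_percentage(without_player)


# Hypothetical results: (match result, did the player in question play?)
matches = [
    ("W", True), ("W", True), ("D", True), ("L", True),
    ("W", False), ("L", False), ("L", False), ("D", False),
]

with_pct, without_pct = split_by_player(matches)
print(f"with: {with_pct:.1f}%  without: {without_pct:.1f}%")
```

A fan could fill the list from a season's results and team sheets; with only a handful of matches in either group, the difference should be read with caution.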

The use of data and statistical analysis is risky for managers. Football is an inherently conservative industry and there exists a certain distrust of numbers. If a manager were to use data and statistics significantly and then fail to produce results they’d likely find themselves in a weaker position than a traditional, close-minded manager.

Professor Gerrard referred me to the example of Chelsea a few years back, when the midfield under Mourinho was the highly effective Makelele-Lampard-Essien axis. By looking at the passing statistics, teams saw the importance of Makelele to making that unit work, so aimed to limit the Frenchman’s influence on the game. Naturally, focussing heavily on one of the players ensured the others had significantly greater space and time on the ball, leading to an improvement in the performance of the other two which ultimately compensated for the reduced impact of Makelele. Such is the limitation of performance data in football: it can only be used to better inform tactical systems and strategies.

Professor Gerrard also spoke of certain statistics being more valuable to an onlooker than others. Any defensive statistics, whether a successful tackle or interception, decisively deny the opposition an opportunity to score, forcing them to first regain the ball. Possession statistics don’t have the same effect in ascertaining a team’s chance of winning.

Pass completion better correlates to not losing the game, which seems logical as a team that has the ball is effectively denying their opponent any opportunity to score until possession changes hands. This possession alone however does not necessarily translate into goalscoring chances, and therefore high possession doesn’t consistently correlate with high win percentages.

If one looks at the passing statistics for the entire Premier League, there is a rough correlation between final league position and passing ability. However, when the top four are removed, positions five to 20 show very little correlation: ideas of efficiency with the ball (as well as the previously mentioned conundrum of what is a successful pass) come into play. To properly assess a team’s chance of winning with offensive statistics, more refined measures are needed, whether shots on target or chances created.
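To make the "remove the top four and look again" idea concrete, here is a minimal Python sketch. It assumes a plain Pearson correlation between league position and some passing measure; the article does not say which statistic or method Professor Gerrard used, and the eight-team table below is invented purely to show the mechanics.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def correlation_with_and_without_top(
    positions: Sequence[int], passes: Sequence[float], top_n: int = 4
) -> Tuple[float, float]:
    """Correlate position with a passing stat, overall and without the top_n sides."""
    table = sorted(zip(positions, passes))  # order by league position
    overall = pearson([p for p, _ in table], [s for _, s in table])
    rest = table[top_n:]
    trimmed = pearson([p for p, _ in rest], [s for _, s in rest])
    return overall, trimmed


# Hypothetical eight-team league: completed passes per game by final position
positions = [1, 2, 3, 4, 5, 6, 7, 8]
passes = [500, 480, 470, 460, 300, 320, 290, 310]

overall, trimmed = correlation_with_and_without_top(positions, passes)
print(f"all teams: {overall:.2f}  without top four: {trimmed:.2f}")
```

In this made-up table the overall figure is strongly negative (a better position goes with more passes) only because of the gap between the top four and everyone else; once they are removed, nothing is left of the relationship.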

We then moved on to what I took to be a positional bias in certain performance analysis models. A quick glance at the top 20 in the Castrol Ranking reveals fourteen forwards (who make up the entire top ten) and only three defenders, two midfielders and a single goalkeeper. Such a positional bias was also noticeable in the Fink Tank model, though less pronounced.

Professor Gerrard stated that his approach was to view a football team as a chain when it came to scoring goals. A team will score more goals if they have defenders who are good on the ball or a goalkeeper with excellent distribution, but ultimately what is most important is the final link in that chain, the forward. Hence any model exhibits a certain degree of bias towards the player who actually puts the ball in the net.

We moved on to discussing performance analysis and how it relates to the transfer market. If we consider how players are valued, there is certainly a bias towards players who score goals. All things being equal, a forward will cost more than an attacking midfielder, who will cost more than a defensive midfielder, who will cost more than a defender, who in turn is slightly more expensive than a goalkeeper. I asked Professor Gerrard whether the bias in, for example, the Castrol Rankings toward “goal-getting” players, as well as the higher transfer fees for such players, amounts to the market proving that these goal-getters are of higher value than their team-mates.

He disagreed, arguing that in fact they may be overvalued by the transfer market. He argued this by pointing out that if we consider shots on target, a striker’s shot conversion rate (goals scored as a percentage of shots on target) plus the opposition goalkeeper’s save percentage (saves made as a percentage of shots on target) will equal 100% (ignoring the odd block by a defender or beach ball intervention).
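The arithmetic behind that 100% figure can be checked with a short Python sketch. It assumes, as the simplification above does, that every shot on target is either scored or saved; the function names and the numbers are made up for illustration.

```python
def conversion_rate(goals: int, shots_on_target: int) -> float:
    """Striker's goals as a percentage of shots on target."""
    if shots_on_target <= 0:
        raise ValueError("need at least one shot on target")
    return 100.0 * goals / shots_on_target


def save_percentage(saves: int, shots_on_target: int) -> float:
    """Goalkeeper's saves as a percentage of shots on target faced."""
    if shots_on_target <= 0:
        raise ValueError("need at least one shot on target")
    return 100.0 * saves / shots_on_target


# Hypothetical duel: 20 shots on target, 6 of them scored.
shots_on_target = 20
goals = 6
saves = shots_on_target - goals  # assumption: every other shot is saved

striker = conversion_rate(goals, shots_on_target)
keeper = save_percentage(saves, shots_on_target)
print(f"conversion {striker:.0f}% + saves {keeper:.0f}% = {striker + keeper:.0f}%")
```

Every percentage point the goalkeeper adds to his save rate comes straight off the striker's conversion rate, which is the symmetry the argument about valuations rests on.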

Hence, going back to the theory that those at the end of the “chain” are the most important, we must consider that there are two “chains” per team: an offensive chain (goalkeeper to defence to midfield to attack) and a defensive chain (striker to midfield to defence to goalkeeper).

So, if we are to argue that strikers should be highly valued for their ability to score goals, goalkeepers and defenders should be equally highly valued for their ability to prevent a goal. After all, a goalkeeper saving a shot that would otherwise have gone in has, on the scoreboard, made the same contribution as a team-mate who scored a goal.

Through this, Professor Gerrard argues there may exist what economists would call an inefficiency in the market: something is undervalued relative to what it gets you for the cost. Hence we can see why many teams that hit above their weight (the Blackburns, Stokes and Boltons of the past few years) are generally defensive sides: every pound spent on wages/transfer fees for defensive players will generally lead to more points gained than an equivalent spend on offensive players.

16 Replies to “Statistical analyses in football: an interview with the Professor”

  1. do you have a statistic that shows if rooney has ever beat another player with a dribble.
    i have never ever seen rooney dribble round another player.
    oops, nearly forgot, he always dribbles when he gets sent off.

  2. What a brilliant article. Very interesting and a great read. I recently came across those Castrol rankings, but immediately dismissed them when I saw that Cesc Fabregas was only ranked something like 34 (although he has moved up to 25 now I see). But statistical analysis of footballers’ performance is something I also have become a bit more interested in of late. I, probably like many others, didn’t believe they had that much place in football really. I have always viewed football as a fluid, simple game, hard to quantify, where traditional teamwork is the key to success (unlike American sports, which in my opinion are all stop / start and based largely on individual events or “plays” which is why they are so stats heavy – sorry to offend any Baseball and Gridiron lovers out there, but can any sport that had to invent cheerleaders to try and drum up interest from the crowd really be that good?), but I watched an interview with Arsene on BBC’s Inside Sport a year or two ago, and saw that Arsene uses statistical analysis constantly to evaluate his players and how well they are functioning within the team (some of which have been highlighted in the article). So I thought that if they are good enough for the great man himself, then they are good enough for me. But I think you are dead right in that it is how the stats are used and also what particular stats are used, that is fundamental to getting the best out of them.

  3. I must say Phil this is something that is food for thought and I need some time to think this over. A very interesting article I must say. Thanks and I will be back later with my analysis of the analysing. pfew…

  4. Man I love this blog. I always like things that make me think or improve my knowledge … I’ve always told people that although I’m not the biggest fan of Denilson’s, I always thought that he has a mean long range shot which has won us games this season, but I have also stated that the problems I see him have are all easily curable (giving away too many fouls in his own half and giving up after losing the ball).

    My point is this: Wenger is probably using stats like this to select him, which now makes sense. I haven’t always been Denilson’s biggest fan, so I would always reply “the boss knows what he is doing”. Now, no more: I can just say he is using statistical analysis.

    Only thing I’m scared of is if Wenger leaves before he is done doing what he needs to but I guess who ever takes over will really be an idiot to mess things up cause Wenger has done all the work already.

    ps le Grove’s editors are idiots

  5. aaron: the issue with stuff like Castrol is it tries to be purely objective when rating football. But as it says in the article, it really doesn’t work like that. Bolton under Allardyce and Wenger now are basically playing different games, and cannot be compared.

    What Mr Gerrard does, from what I know of his system of player ratings, is add an element of subjectivity, i.e. how does the side play? How does the player play? The answer to those two questions defines the analysis criteria, and you end up with a more accurate overall model.

  6. Ah.. the final paragraph hit the nail… I’ve always wondered how Stoke fans can wake up every Saturday morning and say “Yay! It’s the game today!” and trudge to the Britannia or worse somewhere far off..

    Great article. I feel football is a bit like Quantum Mechanics (no no not Tony’s Quantum football): the measurement of one parameter affects the other one, and you cannot make a complete picture of the game with parameters measured in isolation. So football will always remain an intuitive sport and success will be determined by the “gut feeling” of the players and the manager. But this is not to say that we have to be close minded and shun technology, but instead use technology to train that intuition.

  7. “Hence we can see why many teams that hit above their weight (the Blackburns, Stokes and Boltons of the past few years) are generally defensive sides: every pound spent on wages/transfer fees for defensive players will generally lead to more points gained than an equivalent spend on offensive players”.

    Not clear to me why this will be so of necessity.

  8. It’s only a necessity for teams whose management team hasn’t got the imagination, insight or belief to overcome their fear. In theory that’s why they cannot create real balance and they sway to the defensive aspects without hitting true heights through aligning offensive attributes with defensive attributes. Italian football gives faith to this, and European football, to counteract the flair and success of more adventurous, truly balanced teams. In my opinion it’s the difference between a robot and a human. A computer and the human brain. The computer has predetermined processes. A human has the potential to surprise. It’s based on fear. Some drop back into their comfort zone to get those consistent if not ground breaking results. Ultimately though it’s based on human nature which we view in everyday life. Some humans can be overcautious. Some humans may not be cautious enough and some are balanced in their approach. Personally I support Arsenal for the balanced approach even though the season just gone I think they fell into the not cautious enough section. In my opinion this is down to the formation. Obviously there are other reasons but I think the fundamental flaw for last season was the formation and the indiscipline within it tactically and individually. Let’s hope for an evolution next year. As it was their first crack at the formation last year they did okay. Now for them to do great things.

  9. I feel sympathy for the rise of the defensive football team. It is the market’s vagaries that have allowed them to flourish and the twisted inequality of finances that is at its heart.

    If market inefficiencies mean that for what limited finances you have you get more bang for your buck buying good defensive players and then playing to the team’s strengths, how can you be blamed for that?

    You can rarely buy top class striking talent and when you do somehow find it it is almost impossible to hold onto.

    Unless you have a visionary like Wenger it is hard to compete and it isn’t fair to judge the policy clubs like Stoke adopt against the approach Arsenal have taken. I am sure Wenger could achieve great things at smaller clubs and play attractive football to boot but the point is Wenger should not be compared to other managers – in this respect it simply wouldn’t be fair.

  10. P.S. What really pisses me off are the clubs like Rafa’s Liverpool, who had money to burn (and did) but played shit football anyway. Unforgivable.

  11. 0.9 Calibre: Indeed. Certain sports (like baseball or cricket) that are just one man versus a team can be pretty much played via the data. Football isn’t like that: it depends on your role within the team and your teammates’ actions.

    Take Cesc for example. He’s probably the best passer of the ball in the league, but he doesn’t have the best pass completion % in our team! This is due to his role (he plays the harder passes, of course). If you put Cesc in Bolton’s team, he would have a lower pass completion despite being the same player. Why? Because his teammates wouldn’t be making the best off the ball moves.

    Ole Gunner: not sure what your query is. Are you just disagreeing with the last paragraph?

    meditation: Perhaps. But if you look at the transfer market, world class defensive players would be much cheaper than world class offensive players, and yet some would argue the two have an equal impact on the final scoreline (a goal prevented has the same impact as a goal scored when considering winning or losing).

  12. Agree with you there, Jonny. Clubs like Stoke and Blackburn will also look to get more bang for their buck by exploiting the rules – going for bruisers as these get penalised less than they should in the EPL, so again are more effective for cost. Ultimately, it’s distasteful, but we can’t blame a club for maximising their own interest – we should blame the EPL authorities for allowing such play to take root and seemingly flourish.

  13. Well said Phil. Managers get away with all that they can as far as the application of the rules is concerned – and all that they can within the constraints of club finances. Except, of course, if you’re Harry Redknapp who will happily bankrupt you.
    As far as analysis goes football faces the dual difficulty of being perhaps the most fluid of sports and also the one within which scoring is the most difficult. Luck plays a part in all sports. In cricket, winning the toss can be fundamental to the outcome of the game while in football which way a ball rebounds off a post can also decide a result.
    Great managers do everything they can to minimise the effects of luck by studying the stats in order to shift the odds even marginally in their favour. It’s when their actions are outside the rules of football and/or business investment logic that leads us to the situation that football is in now.

  14. I know that Big Sam Allardyce gets his fair share of bad press from Arsenal fans, but I think a lot of it is a bit unjustified. Yes, his teams play hard and are organised, but he’s not the long ball merchant that everybody tries to make him out to be. Don’t forget that this is the manager who brought JJ Okocha to the Premiership – one of my favourite players these last few years. He also took a chance on Mario Jardel, who was potentially one of the world’s best strikers when playing in Portugal (I think personal problems wrecked his career ultimately). Stelios was another good player and Campo was an excellent footballer and very cultured. His Bolton side played some good stuff but had a hard spine with Nolan and Davies etc. I think what Big Sam does is try and get his teams organised for a couple of years, and then when they are established, he adds a bit more attacking flair, which is a fairly good way to do things. Blackburn are now OK under him and he is now trying to get good footballers like Guti into the club. Newcastle should have stuck with him IMO. They would never have been relegated if they had. Tony Pulis on the other hand I have nothing but loathing for. I’ve never seen any of his teams play a decent game of football, in all the years I’ve been watching, and he seems to even rub many of his own players up the wrong way, and publicly too – like Beattie, Tuncay, and Kitson for example. I also doubt that he even knows what a statistic is and would guess he has never used them.
    I also think that the market is now equalising somewhat in terms of offensive and defensive players, so this is maybe a sign that more managers are using statistical analysis to assess all areas of the team. For example, to buy Maicon today, you will probably need about £30m. Defensive midfielders like Mascherano and Toure will now cost £25m+. Vidic would also cost about £25m. These prices are in the realms of what it would have cost to buy the Ronaldinhos and Shevchenkos a few years back, yet are now accepted as the going rate for the best defensive players.

  15. Amen to that Jonny. It’s neither the managers’ fault nor the boards’. It’s clubs like Madrid, Shitty and Chel$ki that push the distribution to one extreme.. And yeah Liverpool does play like shit but their fans are bearable so I don’t give them grief! The Mancs get it the most.. of course I don’t live in England so I dunno how the “Scousers” really are!
