By Tony Attwood
If you are a regular reader you’ll know that I sometimes have a little snigger about the way the daily press in England pick up on our stories occasionally and then do a re-write of them – either covering the same ground, or being persuaded to set out an alternative vision by whoever it is we are having a bash at.
But of course sometimes it happens the other way around. I’ve been plodding along for the last couple of days working on a piece about Arsenal goalkeepers, in the light of the Never Ending Story about who we are not going to sign, and then up pops the story in the Telegraph.
The question to be answered is, “who has been our best keeper?” which might give us a clue as to whether Mr Wenger is good at choosing keepers or not.
Now what delayed me in writing this piece was the gathering of data and trying to decide how to evaluate a goalkeeper. The Telegraph however decided to keep it dead simple by looking at the number of clean sheets each keeper has kept per game played. So they won the race, and as my figures are not really leading to an outright conclusion, I might as well take theirs, and add a few snippets of my own en route.
The answer they came up with was of course David Ooooospina. They say of him, “The Spaniard has the best record for the club when it comes to clean sheets, and also the best for goals per minute conceded – 150, no less. Maybe he wouldn’t be a bad long-term option, after all,” which is pretty much what quite a few of us have been saying.
Second place goes to Alex Manninger.
Alex was an interesting chap. In 1998, for example, Manninger was given the Carling Player of the month award. He was then promptly dropped – as Seaman returned from injury. He also had a run of eight consecutive games without conceding at that time, which is rather good.
(Incidentally Alex eventually left us for Espanyol, who then, having signed him, suddenly announced they didn’t have the money to pay for him).
Then the Telegraph goes off the rails a bit by including John Lukic, but only including his second spell with us. During that spell he played just 15 league games in five years, so I think that is a bit silly.
Of course you’ll be wanting to know where David Seaman got to, and he is in fifth. They say, he “Has the second best record when it comes to minutes per goal conceded – 109.2 – but not quite as strong when it comes to clean sheets.”
And this shows where all these stats have a problem. David played 405 league games for us starting in 1990 which means he played in the Almost Unbeaten Season (under Graham – when we just lost one match) but also in the latter days of the Graham era which were pretty awful.
Almunia’s next with 39% clean sheet ratio across 109 games, followed by Fabianski over a much smaller number of games. Incidentally I thought Fabianski was offered a new contract by Arsenal but turned it down, although these days everyone knocking Wenger uses hindsight to say that’s not the case.
Next up is Wojciech Szczesny with 36% clean ones, which is the same number as Jens Lehmann. But this really emphasises where this sort of analysis goes wrong, because Lehmann is the only goalkeeper in the history of the football league since 1889 to have played in every match of a season and never played on the losing side. That surely says something about a keeper.
All of which says something about how hard it is to evaluate keepers I suppose.
But still, it’s quite fun to look at the stats.
(Well I think so).
————————–
Surely clean sheets are in no way a reflection of a good or bad GK. EG a GK who faces one shot per game and keeps 50% clean sheets is considerably weaker than one who faces 10 shots a game and keeps 40% clean sheets. Similarly, if a bad defence allows a GK to be regularly exposed to one-on-ones, this will impact.
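To put rough numbers on that comparison, here is a minimal sketch. It assumes (and this is only an assumption for illustration) that every shot is saved independently with the same probability, and that each keeper faces a fixed number of shots per game, so the clean-sheet rate is simply the per-shot save probability raised to the number of shots. The function name `save_probability` is made up for this example.

```python
def save_probability(clean_sheet_rate, shots_per_game):
    """Per-shot save probability p such that p ** shots_per_game equals the clean-sheet rate.

    Assumes independent shots and a fixed number of shots per game (a simplification).
    """
    return clean_sheet_rate ** (1.0 / shots_per_game)


# Keeper A: one shot per game, 50% clean sheets.
keeper_a = save_probability(0.50, 1)
# Keeper B: ten shots per game, 40% clean sheets.
keeper_b = save_probability(0.40, 10)

print(round(keeper_a, 3), round(keeper_b, 3))
```

Under those assumptions keeper B is stopping roughly nine shots in ten while keeper A stops one in two, even though A has the better clean-sheet percentage.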
I’m thinking off the top of my head now but surely some sort of combination (I’m sure Gord can help here) of saves v shots on target (ideally from inside and outside the area with some sort of weighting for the former); percentage of crosses caught; one-on-ones saved; and distribution would give a better idea.
Just a starter for ten…………but certainly more thought pit into it than those so-called “journalists”!
For clarification, a “pit” is what the “journalists” should be put into 🙂
I don’t think counting blank sheets is a useful way to judge a goaltender. For a few reasons:
1) If the other team doesn’t shoot, they probably won’t score.
2) The lack of goals scored could be due to the other 10 players on the team.
3) In many ways, scoring 1 goal is the same as scoring 0.
4) The goaltender might be good.
We already know there are problems getting any meaningful data out of a single season, so unless a goaltender has more than 38 games, it probably isn’t worth trying to judge them.
Having a single goal scored against, is actually split into two events:
1) A long interval of time where no goal is scored.
2) A very short interval of time, in which the goal is scored.
This definition makes use of the calculus, which you don’t want to see. But consider that the goal is scored a very short interval of time after the 90 minutes expired. Now the probability of a goal being scored depends on the accuracy of the referee’s timekeeping. But, scoring 0 goals is much the same as scoring 1 goal. Or being scored on by 0 or 1 goal.
You want to try and look at all the data, instead of just making a small amount of data even smaller. You can try to ignore the presence of the rest of the players, but you might get a nonsensical answer. Try to fit the observed histogram of goals let in to Poisson, binomial and negative binomial distributions. Does one of those distributions produce a good fit? If that good fit gives an average goals scored per game less than 1, we can use the inverse of that average as a measure of how good the goaltender is. Note that you cannot just take the goals scored and divide by the number of games played; that is only fitting a single parameter of a distribution. We need to consider all the data: how often 2, 3, or more are scored.
It is entirely possible that a person can find a good fit, and still not have the analysis generate a measure of how good the goalkeeper is. Not enough shots by the other team, too efficient defending by the outfield players, warped officiating, ….
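As a rough sketch of what fitting the whole histogram might look like, the snippet below tries Poisson only (binomial and negative binomial are left out to keep it short). The counts in `observed` are made up for illustration and are not any real keeper's record, and the crude grid search with a Pearson chi-square score is just one simple way to use every bin of the histogram; a real analysis would likely use a proper fitting routine.

```python
import math

# Made-up histogram: number of games with 0, 1, 2, 3 and 4 goals conceded.
observed = [24, 24, 12, 4, 1]
games = sum(observed)


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson law with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def chi_square(mean):
    """Pearson chi-square between the observed counts and Poisson expectations."""
    top = len(observed) - 1
    probs = [poisson_pmf(k, mean) for k in range(top)]
    probs.append(1.0 - sum(probs))  # last bin collects "top or more" goals
    return sum((o - games * p) ** 2 / (games * p) for o, p in zip(observed, probs))


# Crude grid search over candidate means from 0.50 to 2.00 goals per game.
candidates = [i / 100 for i in range(50, 201)]
best_mean = min(candidates, key=chi_square)
rating = 1.0 / best_mean  # the inverse-of-average measure suggested above

print(best_mean, round(chi_square(best_mean), 4), round(rating, 3))
```

A small chi-square at the best mean only says the Poisson shape is plausible for these counts; it does not by itself say the number measures the keeper rather than the defence or the officials.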
Assuming Poisson is the best, and an average goals per game of 1 is the best fit, we get the following table of probabilities for each number of goals per game.
Goals . Probability
0 . 0.367879
1 . 0.367879
2 . 0.1839397
3 . 0.06131
If you make a histogram, you won’t have fraction values for the Y axis, you will have counts. The thing to observe is the counts for 0 and for 1 are about the same, the counts for 2 is 1/2 that of 1, the counts for 3 is 1/3 that of 2, the counts for 4 is 1/4 that of the counts for 3, and so on.
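For anyone who wants to check those figures, a minimal sketch follows. It assumes a Poisson distribution with a mean of exactly 1 goal per game, as in the comment; the names `poisson_pmf`, `table` and `ratios` are just labels for this example.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals when goals follow a Poisson law with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


MEAN_GOALS = 1.0  # assumed best-fit average goals conceded per game

table = {k: poisson_pmf(k, MEAN_GOALS) for k in range(5)}
for k, p in table.items():
    print(k, round(p, 6))

# The ratio of successive probabilities is mean / k, so with a mean of 1 it is 1/k:
# 2 goals is half as likely as 1, 3 goals a third as likely as 2, and so on.
ratios = {k: table[k] / table[k - 1] for k in range(1, 5)}
```

The 1/k pattern only holds because the mean is 1; for any other mean each ratio becomes mean/k.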
If the highest number of goals against is 4, and that happened once, we expect the number of 3 goal games to be 4, the number of 2 goal games to be 12, and the number of 1 or 0 goal games to be 24 each. If we take the goaltender’s record, and divide it up into 65 game chunks, we expect to see a single 4 goal game, four 3 goal games, twelve 2 goal games and twenty-four each of 0 or 1 goal games. This 65 games is almost two seasons, and we are only expecting a single 4 goal game.
Looking for 5 goal games? Now we are looking at sets of about 300 games, which is about 8 seasons. And we expect to only find a single 5 goal game.
6 goal games? We want to look at sets of about 2000 games (51 years). 7 goal games? Sets of almost 15,000 games. 8 goal games? Sets of almost 110,000 games.
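The set sizes quoted above can be reproduced with a short sketch. It assumes the same Poisson model with a mean of 1, where the count of k goal games relative to a single game with the top score is (top score)!/k!, and it ignores games with more goals than the top score. The function name is made up for this example.

```python
import math


def games_for_one_top_score(max_goals):
    """Size of a set of games containing one max_goals game, with lower scores
    appearing in Poisson(mean = 1) proportion."""
    # With a mean of 1, count(k) / count(max_goals) = max_goals! / k!
    return sum(
        math.factorial(max_goals) // math.factorial(k) for k in range(max_goals + 1)
    )


set_sizes = {n: games_for_one_top_score(n) for n in range(4, 9)}
for top, size in set_sizes.items():
    # 38 league games per season, as used in the comment above
    print(top, size, round(size / 38, 1))
```

This gives 65, 326, 1957, 13700 and 109601 games for top scores of 4 to 8, in line with the rounded figures in the comment.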
This guessing as to how many games are needed to see a single occurrence of an extreme value depends quite sensitively on the fitted parameters. Even for a Poisson distribution, which only has a single parameter, changing that parameter by a small amount can change how likely this extreme event happens by quite a bit.
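A quick sketch of that sensitivity, again assuming a plain Poisson model. The means of 1.0 and 1.2 are picked only for illustration, and `games_per_occurrence` is a made-up helper that returns the reciprocal of the probability of a given score.

```python
import math


def games_per_occurrence(k, mean):
    """Average number of games between k goal games under a Poisson law."""
    probability = math.exp(-mean) * mean**k / math.factorial(k)
    return 1.0 / probability


at_1_0 = games_per_occurrence(6, 1.0)  # mean of 1.0 goals per game
at_1_2 = games_per_occurrence(6, 1.2)  # mean of 1.2 goals per game

print(round(at_1_0), round(at_1_2), round(at_1_0 / at_1_2, 2))
```

Raising the mean by a fifth makes a 6 goal game more than twice as frequent, which is why small errors in the fitted parameter matter so much for the extreme scores.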
Football games are low scoring affairs in general. We expect the largest number of goals attained or given up by teams to be mostly 0 or 1. And yet, if we look through the record books, we see games of 4 or more goals by one team (or even both teams) fairly often. It is very likely the analysis is more complicated than what I have outlined in 2 comments.
I suspect either comment is more words than the journalist used in the first place.
Just goes to prove my point of a few weeks back that you can prove anything by manipulating stats, but I’m sure the Untold thug will find some reason to disagree again.
We’ve been fortunate at having had some of the best keepers around and hopefully the trend will continue.
Did The Telegraph really call Ospina “The Spaniard”?
Sorry but clean sheets are toilet paper.
Goal keepers are very difficult to assess as the defence in front of them play a huge part. The number of crosses intercepted; the number of shots stopped; penalty saves; clearances; creative passes leading to a goal chance; positioning & command of the goal area, all play a part in the assessment.
My opinion is that Chez is the best we have ever had physically but he needs some experience with GK technique & positioning. Perhaps a loan spell in Spain might do him good.
TailGunner – excellent reading skills & good spot.
Gord – I am a menace 😉 . A tender is found on trains. We have keepers to protect our nets from stray balls. As for the hysteria & fishy sexuality of nomials, I’m lost like jambug. 😉
Tailgunner, if by proving things with statistics you mean multiplying all the data by zero and then adding the answer you are looking for, yes there are lots of people doing things with statistics. It is almost impossible to prove anything with statistics, as statistics implies that random chance may provide an unexplainable observation.
Menace, I thought about you and Jambug in writing the above comments. I suspect I may need to justify myself a few times.
Gord,
Hope this doesn’t offend you, but as usual you’ve completely confused me.
Difficult to really assess who the best goalie might have been when there’s a vast difference in number of games played.
But what I find intriguing is the media saying Ospina is rubbish and we need a world class goalie. One of the arguments being put forward for his high number of clean sheets is that he has been well protected by a good defence. Which is alright. But then they go on to say our defence is rubbish, looks very vulnerable and could concede any second, is not as good as our rivals’, we need a world class defender…… Hmmmm, something is not adding up here.
Apologies for copying and pasting but saw this comment by Rich on a thread that’s about three days old which I thought was brilliant.
http://untold-arsenal.com/archives/43233#comment-841552
Statistics aren’t useless here but there has to be some qualitative analysis of the goals conceded, too. In many respects a keeper and the back line should be judged together; they concede goals together even if the last line of defence is the keeper. It is hard to pin Monday’s goal solely on Ospina. And, things don’t EVEN OUT. Of course, at some stage you do have to judge between keepers but clean sheets is certainly not the way.
Jack Kelsey may be regarded by many as our best – ever keeper, because he played through the decade when Arsenal had one of their poorest records. According to some accounts from the time, he was the one world-class player at the club and is credited with single-handedly keeping us in the top division.
It follows from this that his statistics will not be so impressive, so proving that the question of who is the best cannot be judged on figures alone.
Nope, no offense. What confused you? The business about multiplying by 0 and adding the answer I am looking for? About not being able to prove things?
I’m having a dumb moment. English keeper, who I believed played for Derby. Nominally, Derby’s defense was bad, and consequently that keeper got many more shots than others. Which is similar to what John says.
I think the first step is to try and unravel the number of shots on target, number of shots blocked by the defending team, number of shots deflected by the defending team, number of shots wide, number of other shots and defensive mistakes (such as own goals) to try and partition goals observed into goals that a goaltender was reasonably expected to save, from other goals that also occurred. I suspect that that gets rid of a big number of games like 5-5 and 8-2 that should almost never occur.
Then, if the goalkeeper in question has played enough games that they have a small number of games where 4 or 5 goals were scored, trying to fit the distribution should give a reasonably good estimate as to what the average number of goals allowed per game would be. And I will still suggest that goalkeepers get compared on the inverse of that score.
But it may be better to just give up the idea of games, and look at the probability of stopping a goal based on the idea that a shot has been made (probability of scoring per shot). If nothing else, that would allow for some input from practice sessions.
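A minimal sketch of the per-shot idea, with made-up numbers (120 shots on target, 30 goals) that are not taken from any real keeper. It treats every shot as an independent trial with the same chance of being saved, which is a strong simplification, and reports a binomial standard error so that small samples are visibly uncertain.

```python
import math


def save_rate(shots_on_target, goals_conceded):
    """Estimated probability of stopping a shot, with a binomial standard error."""
    saves = shots_on_target - goals_conceded
    rate = saves / shots_on_target
    std_error = math.sqrt(rate * (1.0 - rate) / shots_on_target)
    return rate, std_error


# Hypothetical record: 120 shots on target faced, 30 goals conceded.
rate, err = save_rate(shots_on_target=120, goals_conceded=30)
print(round(rate, 3), round(err, 3))
```

Shots from training sessions could be fed into the same calculation, though whether practice shots are comparable to match shots is a separate question.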
@Al – it took a friend of mine to open my eyes to how important it was for rednose to overtake the history of Shankly & Liverpool. It is so obvious that it blinds common sense. He also wanted the undefeated season but no one can get it that easy. It was a special team with special qualities instilled by Wenger that did that.
When physics people make measurements, you usually get good numbers and a good error analysis. Ars Technica is running an article on bias development in estimates of sea level made by satellites, that I will pull a part of a paragraph from:
> This gets interesting for a couple of reasons. The first is that instead of a slight deceleration in the measured rate of sea level rise of -0.057 ± 0.058 millimeters per year, you get a slight acceleration of +0.041 ± 0.058 millimeters per year. The error bars of each obviously overlap with zero, but it’s a significant shift.
If a person was doing linear least squares on some data, and the Y intercept had a value of something like -0.057 +/- 0.058, what a person would do is find out the multiplier on the intercept uncertainty (0.058), and then ask the question as to whether the number 0 was included in the expanded range. For the data sets I’ve typically run across, that multiplier is often around 2. But just looking at the estimated Y intercept and its precision, a multiplier of 1 would still allow the range to contain 0.
Because the range includes 0 (because 0 has special meaning), you could not accept a value different from 0, and hence one would have to change the estimated regression line from what was originally calculated, to a different line which goes through the points (0,0) and (, ) where the angle brackets mean find the average value of.
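The check described here can be sketched in a few lines. The two estimates and the 0.058 uncertainty are the figures quoted from the article; the multiplier of 1, the helper names and the three data points are assumptions for illustration only. The replacement line is the one described in the comment, through the origin and the point of averages, which is not the same thing as a least squares fit forced through the origin.

```python
def contains_zero(estimate, uncertainty, multiplier=1.0):
    """True if the range estimate +/- multiplier * uncertainty includes 0."""
    half_width = multiplier * uncertainty
    return estimate - half_width <= 0.0 <= estimate + half_width


def slope_through_origin_and_mean(xs, ys):
    """Slope of the line through (0, 0) and (average of x, average of y)."""
    x_average = sum(xs) / len(xs)
    y_average = sum(ys) / len(ys)
    return y_average / x_average


old_ok = contains_zero(-0.057, 0.058)  # earlier estimate from the quoted paragraph
new_ok = contains_zero(0.041, 0.058)   # revised estimate from the quoted paragraph

# Made-up data points, only to show the replacement line being computed.
slope = slope_through_origin_and_mean([1.0, 2.0, 3.0], [2.1, 3.9, 6.0])
print(old_ok, new_ok, round(slope, 3))
```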
In the particular instance of the story at Ars Technica, neither the previous measurement or the new measurement could be said to be statistically different from 0. But I suspect the measurements were of such a nature, that values could be paired up for all possible pairs, and one could look at the difference. Or more particularly, the sign of the difference. And they observed a consistent difference between the number of positive and negative signs. While you couldn’t say that either value is significantly different from 0, you can say the second value is significantly bigger than the first value.
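Looking only at the signs of paired differences is usually called a sign test. Whether the researchers actually did this is only a guess in the comment above, and the ten differences below are invented, but a minimal sketch of the test itself looks like this.

```python
import math


def sign_test_p_value(differences):
    """Two-sided sign test: are positive and negative differences equally likely?"""
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives  # ties (zero differences) are dropped
    extreme = max(positives, negatives)
    # Probability of a split at least this lopsided if each sign were a fair coin.
    tail = sum(math.comb(n, k) for k in range(extreme, n + 1)) / 2**n
    return min(1.0, 2.0 * tail)


# Invented paired differences (second measurement minus first).
diffs = [0.09, 0.11, 0.10, -0.02, 0.08, 0.12, 0.10, 0.07, 0.13, 0.09]
p_value = sign_test_p_value(diffs)
print(round(p_value, 4))
```

Nine positive signs out of ten would be unusual if the two sets of values did not differ, even though neither value on its own can be told apart from 0.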
And maybe that goes a little way towards my statement about not being able to prove things with statistics?
Grrr, wordpress eats angle brackets. So above, I have:
(0,0) and (, ) where the angle brackets mean find the average value of
replace (, ) with (x_average, y_average).
Gord – while you were doing the stats, Juventus just went through to the Champions League Final v Barca. I was not impressed by some of the refereeing but hey nothing new there. I don’t think football can survive without some cheating. The time wasting & game ‘management’ techniques are just so blatant.
Cheating?
Someone told the BBC that Gerrard is going to make a fine manager. And we all know that he cheats, so what does that say about the manager who told the BBC this? 🙂
Looking at News, some Manchester site has this for a headline:
> Manchester United: Arsenal to target Fellaini at Old Trafford
You can be damned sure, that Arsenal does not define _target_ the same way that Paddy McAsshole does.
True that, Menace.
Menace
That subject’s been on my mind a lot in the last fortnight or so (when I’ve watched an absurd amount of football)
Bit of a quandary really: I detest the cheating; it is not going away; it puts us at a huge disadvantage, being about the only one of Europe’s top clubs who doesn’t really indulge, let alone have it (craftiness or outright cheating) as deeply embedded into their DNA as any other quality is.
I noticed with Barca yesterday: even with the game undoubtedly won, they can’t stop themselves, it is automatic: on at least three occasions they went through their great pantomimes of pain (searching for bookings) when a stray arm brushed their back or head.
A real eye-opener came when I read a book not long ago about Wenger’s first couple of seasons. One of the games covered in it was a 2:2 draw against Bayern, which I have no memories of at all: apparently we played very well but were undone by a negative Bayern team who were constantly diving and were, unfortunately, well rewarded for it. ‘Craftiness’ got them through (it might have been the 98-99 season!)
It’s a worry and a test of the old footballing morals of mine. Our most rewarding win of the season, City, featured a bit of that craftiness from Monreal. Immaculate honesty and there was no pen.
First, we need to keep up the improvement before we can think of going deep in the champions league; but even then we would do so while knowing that in every game we lose a few per cent of our chance of winning thanks to the other team being ‘crafty’ and us hardly at all.
Oh well, worry about taking those few extra steps on the football side first, I guess.
I know the modern trend is to have data and numbers for everything. Not being a professional coach who am I to dispute the merits of such analysis. But I’m Irish and we’ve been known to even argue when talking to ourselves at times so here I am disputing the merits of just using numbers in the evaluation of a goalkeeper. I did play there for many years so feel a little bit at home on the topic.
In my opinion, it’s a far more subjective judgement and requires that the keeper is assessed over time on and off the pitch and over a range of qualities-
He should have a ‘presence’ in the company of his peers, both on and off the field.
He should be a leader, physically and mentally strong (big helps but not essential).
He should read the game better than most (a keeper’s best work is often done by having his defenders snuff out the danger before an attack produces a shot).
He should be a potent attacking option on occasion when the right distribution can result in a chance for his striker to score.
He should be able to concentrate for every minute of a match, have total self belief (to a point of what may seem like arrogance). This allows him to put a setback out of his mind and move on.
He should be calculating and disciplined, never getting riled or deflected from his task.
I’ve been a Gooner since 1970 and for me, David Seaman is the closest we’ve had to the above, followed by Pat Jennings. Szczesny might get there yet, he’s a work in progress. David ‘The Spaniard’ Ospina is a decent keeper, but not around long enough to judge.
Beyond Arsenal, Shilton, Schmeichel Snr. and Southall were the best I’ve seen.
Sorry Gord, I tried to follow the numbers but now my head hurts and I think I need a lie down for a while.
Dec- good summary – & safe hands with confidence. You missed out a great undefeated -Jens Lehmann. He had to accept stamps on both feet & survived PGMO blindness.
Didn’t forget Jens, he was quality but, I felt sometimes lacked the cold, detached discipline to make the top table.
Dec
Sorry for giving you a headache.
I am not a professional coach, athletic first aid person, referee or player. And yet, I’ve done athletic first aid at the highest levels and been a referee and more a player. And I know a little about numbers.
I agree with most of what you have written.
Instead of presence, I would say a goalkeeper needs to be vocal. In Canada, we have a problem finding vocal goalkeepers.
The thing I read about Lehmann is that he studied players on other teams. He wanted to know if they headed the ball, kicked the ball, left, right, high, low, left leg, right leg and so on.
Anyone has Jens’ saves stats?
No worries, head’s fine Gord. Jens was a class act no doubt. You don’t have an unbeaten season with anything less in goal. Attention to detail is so important. Being vocal is a means of getting a command across, but unless the keeper is in charge his vocals can be ignored or ‘unheard’. When I refer to a ‘presence’ I mean it to be a total command of the defensive effort. The outfield defenders need to accept this state and know intuitively how best to be of service. Being the last man back, the keeper has the best tactical picture of the state of play and can best organise things.
Amazingly, there are some heretics who believe Football is team game with 11 players a side.
Us keepers know that it’s really an individual sport, where 2 opposing goalkeepers direct 10 lesser gifted individuals around a pitch in an effort to beat each other. Sort of like Chess with a ball.
Some use the rant and rave approach (remember Les Sealey of ManU) while others use a little more diplomacy and let the outfield drones feel important.
Wonder how I never made it as a professional coach, hmmm.
Dec
Having good eyesight is also useful for goaltenders.
I have played every position, including a lot at goal. I have astigmatism, but my astigmatism is almost 90 degrees off what most people have. My contact lenses did not correct for astigmatism. One time, someone took a shot from just inside the half line, near the touch line. My astigmatism made me misjudge the ball’s path, and they scored. Embarrassing.
The men’s club that I did the most athletic first aid for, had a goalkeeper who was originally from Belfast. He would occasionally come back from Gaelic tournaments with the strangest injuries.
He had no problem commanding the defence, and one of his centrebacks was from Manchester originally.
This is in western Canada (Edmonton, Alberta).
Brilliant, you’d be a handy man to take a penalty I’ll bet.
They certainly build them tough in Belfast. Usually best not to argue too much. Gaelic football certainly helps with the handling skills etc.
Pat Jennings was testament to that.
Dec
I’ve saved a few penalties, scored on one (playing fullback that game). I could almost do the splits, so I tended to use my legs to get to low balls to the side. But, I always tried to wait until the person taking the penalty actually hit the ball before I started to move. Just inside the post would beat me, straight down the center wouldn’t.
Yes, I was surprised you made no mention of Pat Jennings, the convert from Spurs, who saw the light and had a second coming by signing for the best club in London.
It will always be Pat Jennings for me. The fact that he came from the Spuds as their reject and did very well for us was an added bonus.
David Seaman a very close number two.
Very disappointed not to be included in your analysis Tony …
Played 3 games at a charity event last year and kept a clean sheet so my ratio at the Emirates is 33% … admittedly I conceded one every 5 minutes but you can’t have everything !!!
I also share a birthday with both Szczesny and Fabianski, which must count for something … even if I’d had a few more than both of them 🙂
https://www.facebook.com/photo.php?fbid=10203431903494716&set=pb.1126437165.-2207520000.1431595428.&type=3&theater
It is silly to try to take all aspects of GK.
Take the more important stats:
Number of consecutive games played: points per game.
Number of consecutive games with clean sheet: give points per game.
Type of goal let in:
1, long range volley
2, short range volley
3, inside the box shot
4, header from corner
5, header
6, dribbled goal
7, (when playing against Arsenal) 🙂 multi-player goal
: work out points for each type of goal according to difficulty to save.
These three data blocks can tell quite a lot about the GK.
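A rough sketch of how the goal-type block might be scored. Every weight in `PENALTY_POINTS` is invented purely to show the mechanics (more points meaning a goal the keeper might more reasonably have been expected to stop); working out sensible values is the hard part and is not attempted here.

```python
# Invented penalty points per type of goal conceded; all values are placeholders.
PENALTY_POINTS = {
    "long range volley": 3,
    "short range volley": 1,
    "inside the box shot": 2,
    "header from corner": 2,
    "header": 2,
    "dribbled goal": 1,
    "multi-player goal": 1,
}


def penalty_per_game(goals_conceded, games_played):
    """Total penalty points for the listed goals, averaged over games played."""
    total = sum(PENALTY_POINTS[goal_type] for goal_type in goals_conceded)
    return total / games_played


# Hypothetical keeper: three goals conceded across five games.
example = penalty_per_game(
    ["long range volley", "header", "dribbled goal"], games_played=5
)
print(example)
```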
Well, it looks like the Daily Mail has us all beat. The best indicator of who the best goalkeeper is, is their ability to play the piano while wearing goalkeeping gloves.
http://www.dailymail.co.uk/sport/football/article-3081311/Arsenal-stars-Santi-Cazorla-Tomas-Rosicky-Wojciech-Szczesny-musical-talent-Goonervision-Song-Contest.html
Who else is in the Goonervision?
I think goalkeepers should be treated like every other player in a different position, which means their attributes and gameplay contribute to the team’s style of play. That is why I believe Ospina will be no. 1 GK next season. He deserves it. His calm presence and simplicity suit the defenders in front of him. The current form suggests that. Personally, by talent and performance, Jens Lehmann is the best Arsenal GK in PL history. Crazy, but one tough son of a gun, a leader and, bloody hell, he could pull off stunning saves. Nothing against Seaman, but Moustache Dave made the occasional mistake, sometimes too often. Lehmann was mentally strong and kept improving himself. Szech is still young and maybe he should continue learning and improving himself. He can be the best; only he can deny that.
I never played in goal, but for me the best keepers were the ones I felt comfortable standing behind on the North Bank: Seaman, Lehmann, Jennings, Kelsey. Next level Lukic, Wilson, Manninger, all the way down to the camp comic Fumbling Jim Furnell. We have had some good ones and some not so good, but they all played for The Arsenal and that was good enough for me.
I think that Szczesny may well go. I hope he doesn’t, but either his own or his father’s pride might make him take a decision that he may regret later on.