This week I finally got around to something I’d been meaning to do for ages: pull all the data from games to see what a goalkeeper spends his time doing during a match.
These are the on-ball actions of keepers in the Premier League so far this season: ‘Aerial’, ‘Ball Recovery’, ‘Ball touch’, ‘Challenge’, ‘Claim’, ‘Clearance’, ‘Cross not claimed’, ‘Error’, ‘Free Kick’, ‘Keeper pick-up’, ‘Keeper Sweeper’, ‘Pass’, ‘Punch’, ‘Save’, ‘Smother’, ‘Take on’.
I picked a random Everton game, which turned out to be v Man City on New Year’s Day. I grabbed the data and got the video ready.
The first on-ball event listed for Pickford was a ‘Keeper pick-up’. I roll the video. He actually mishandles a cross, drops it and then gathers safely:
This isn’t pristine data direct from source; it’s swiped. Where’s the ‘Cross not claimed’ event? Does the ‘proper’ source data actually have more than ‘Keeper pick-up’?
Next is a little ‘Throw-out’ (a subtype of ‘Pass’ in the data set) to Yerry Mina, which all looks fine on the video.
Next come eight passes taking us to 6 minutes 23 seconds in. They’re all pretty standard too, no video needed.
On 10 minutes 12 seconds we have an ‘Aerial’ and ‘Ball touch’. Looks like another mishandled cross to me that gets dropped. Yet still no sign of the ‘Cross not claimed’ event in the data:
Four more passes and there’s over 20 minutes gone.
Not long after, however, we get something interesting in the data. Yay! This is what the ‘Keeper Sweeper’ event looks like, swiftly followed by a ‘Clearance’ that Peter Kay would be proud of:
After seven more passes (including a goal kick), Pickford finally makes a ‘Save’ 28 minutes into the game. He then makes another ‘Save’ here a minute later followed by another ‘Keeper pick-up’:
Four more passes for wor Jordan over the next 6 minutes. And then another ‘Save’ followed by another ‘Keeper pick-up’ in the data. If you can see any ‘Keeper pick-up’ here you’ve got better eyes than me:
At this stage, I’m starting to question the data.
Intrigued, I search further back this season for a ‘Cross not claimed’ for Pickford. That being a cross not caught. According to the data, Pickford hasn’t fumbled one all season. Hmm. Do they even exist?!
I go away and eventually find one. Sorry Bernd Leno. But at least you’ve proved they exist in the data set. This is an ‘Aerial’ and ‘Cross not claimed’:
So I’m getting the ‘Aerial’ thing as a duel with an opponent going for the same ball. But Leno isn’t even trying to catch the ball here. He’s trying to punch it.
In the Pickford vids up above he’s actually trying to catch the thing. And doesn’t. Yet those aren’t labelled as ‘Cross not claimed’. Bizarre.
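For anyone wanting to repeat that search, here is a minimal sketch in Python. The `player` and `type` field names are my assumptions for illustration, not the real feed’s schema, and the handful of rows only mirrors events described in this post rather than a full season.

```python
# Illustrative rows only: the field names are hypothetical, not the provider's schema.
events = [
    {"player": "Jordan Pickford", "type": "Keeper pick-up"},
    {"player": "Jordan Pickford", "type": "Aerial"},
    {"player": "Jordan Pickford", "type": "Ball touch"},
    {"player": "Bernd Leno", "type": "Aerial"},
    {"player": "Bernd Leno", "type": "Cross not claimed"},
]


def find_events(rows, player, event_type):
    """Return the rows for one player that match one event type."""
    return [r for r in rows if r["player"] == player and r["type"] == event_type]


pickford_unclaimed = find_events(events, "Jordan Pickford", "Cross not claimed")
leno_unclaimed = find_events(events, "Bernd Leno", "Cross not claimed")

# An empty result for a keeper who visibly drops crosses on video is the red flag.
print(len(pickford_unclaimed), len(leno_unclaimed))
```

An empty list on its own proves nothing about the keeper. It only tells you the label was never applied, which is exactly why the video check matters.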
At this stage, I’m almost ready to throw the data in the bin.
I hadn’t cherry-picked this game to highlight this point about bad data; I just stumbled across it. It’s literally the first game I looked at for the real point of this article. Maybe I just picked a bad ‘un and the rest are fine…
Obviously, there are implications here if you want to use the data to compare keepers across the board against various skill sets. Considering the definitions of these same actions by other data and video providers appear to be different (judging by the amount of missing information), the whole thing is a minefield.
The data is supposed to help filter down the number of players scouts need to watch. Clearly, it doesn’t, because to do it properly you have to watch every damn minute again.
The real point of this article was to make a small point about evidence-based coaching. If 75% of the game for a keeper these days is distribution, how much time do coaches spend on it in training?
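As a rough sanity check on that sort of figure, the Pickford events walked through above (roughly the first 35 minutes) can be tallied in a few lines of Python. The counts come straight from the walkthrough. Treating only ‘Pass’ (including the ‘Throw-out’) as distribution is my assumption, and the function name is just illustrative.

```python
from collections import Counter

# Counts taken from the Pickford walkthrough earlier in this post.
# The 'Throw-out' is counted under 'Pass', as the data set does.
event_counts = Counter({
    "Pass": 24,  # 1 throw-out + 8 + 4 + 7 + 4
    "Keeper pick-up": 3,
    "Save": 3,
    "Aerial": 1,
    "Ball touch": 1,
    "Keeper Sweeper": 1,
    "Clearance": 1,
})

# Assumption: only 'Pass' counts as distribution here.
DISTRIBUTION_TYPES = {"Pass"}


def distribution_share(counts, distribution_types=DISTRIBUTION_TYPES):
    """Fraction of all on-ball events that are distribution events."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    distribution = sum(n for t, n in counts.items() if t in distribution_types)
    return distribution / total


share = distribution_share(event_counts)
print(f"{share:.1%}")  # prints 70.6%
```

That is 24 of 34 events from part of one match, so it proves nothing on its own, but it sits in the same region as the 75% figure.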
I recently spoke with Ostersunds FK keeper coach, David Preece, about making sessions more like matches.
“If we do too many game-realistic drills in training, Aly (Keita) doesn’t feel like he’s had enough of a work-out. It’s because it takes time to prepare and set it all up. At the end he’s like: ‘I’ve only made 20 to 30 saves!’ He wants to make more. Finding a balance is difficult.”
As a data guy, I’m not advocating spending 75% of training time on a keeper’s distribution. Goals win games, so making saves is always going to weigh more heavily in terms of action importance. Working on the expected save data all these years makes me think saves are even more important (and difficult) than most people think they are.
Head of Goalkeeping at the FA and England, Tim Dittmer, said this:
“Actions from the game represent different ‘ratings’ in terms of impact and importance. This would help guide us to what training looks like and how much we do of certain topics and actions.”
Having personally been a thorough skeptic in the past about how important a keeper being good with the ball at their feet is, I’ve mellowed on it. Especially with the change in law this year with keepers being allowed to play goal-kicks to players inside their own box.
I send Tim a picture of a tweet I sent:
“Funny you should ask. We did a session that looked just like this today.”
I try to pick holes. Was it a standalone keeper session, or did it include outfield players as well? If it didn’t, it seems a bit sub-optimal?
Then comes the touché:
“The session had both keepers and outfield players included and two coaches took it. Your ‘sub-optimal’ comment would suggest that anything other than 11 v 11 on a full-size pitch falls into this category.”
It’s pretty clear that practical obstacles to evidence-based goalkeeper coaching exist even in the professional game at the highest levels. Realistically, you do the best you can, and plan sessions to suit the resources available and keep your goalkeepers happy and on board.
“Of the 75% stat of the game being distribution, only 20% of these are regarded as ‘under pressure’. Firstly, what should that mean to the amount of meaningful time we spend on this area and secondly, what are the types of cues, triggers and pressure these actions need to be practiced within?”
This is the point to me. Ask questions of what you’re doing and why.
Feedback from other coaches was muted and mixed. Every single one I spoke to refused to make even a rough percentage guess of how much of their training routines distribution made up. It doesn’t really work like that, apparently.
I still feel like there are too many sessions apart from the rest of the team. Too many sessions are not game-specific enough because of the resources available or because keepers themselves get restless.
As David said, maintaining a balance between what’s right, good, available and acceptable to players makes coaching a difficult business. It needs collaboration across the club. How do we stop making goalkeeping so much of a lonely business?
Maybe everyone’s too busy making fancy-looking videos for YouTube that the rest of the Goalkeeper’s Union snipe away at on Twitter. Get together and make things better. I’m not even a coach and two of the best national team goalkeeper coaches and coach educators in the world give their time to me…